commons-text-commons-text-1.8/ 0000775 0000000 0000000 00000000000 13532277265 0016466 5 ustar 00root root 0000000 0000000 commons-text-commons-text-1.8/.gitignore 0000664 0000000 0000000 00000000324 13532277265 0020455 0 ustar 00root root 0000000 0000000 # Maven build files
target
*.log
maven-eclipse.xml
build.properties
site-content
*~
# IntelliJ IDEA files
.idea
.iws
*.iml
*.ipr
# Eclipse files
.settings
.classpath
.project
.externalToolBuilders
/.checkstyle
commons-text-commons-text-1.8/.travis.yml 0000664 0000000 0000000 00000001775 13532277265 0020611 0 ustar 00root root 0000000 0000000 # Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
language: java
jdk:
  - openjdk8
  - openjdk11
  - openjdk12
  - openjdk13
  - openjdk-ea
matrix:
  allow_failures:
    - jdk: openjdk-ea
script:
  - mvn
after_success:
  - mvn clean test jacoco:report coveralls:report -Ptravis-jacoco
commons-text-commons-text-1.8/CONTRIBUTING.md 0000664 0000000 0000000 00000014601 13532277265 0020721 0 ustar 00root root 0000000 0000000
Contributing to Apache Commons Text
======================
You have found a bug or you have an idea for a cool new feature? Contributing code is a great way to give something back to
the open source community. Before you dig right into the code there are a few guidelines that we need contributors to
follow so that we can have a chance of keeping on top of things.
Getting Started
---------------
+ Make sure you have a [JIRA account](https://issues.apache.org/jira/).
+ Make sure you have a [GitHub account](https://github.com/signup/free).
+ If you're planning to implement a new feature it makes sense to discuss your changes on the [dev list](https://commons.apache.org/mail-lists.html) first. This way you can make sure you're not wasting your time on something that isn't considered to be in Apache Commons Text's scope.
+ Submit a [Jira Ticket][jira] for your issue, assuming one does not already exist.
  + Clearly describe the issue including steps to reproduce when it is a bug.
  + Make sure you fill in the earliest version that you know has the issue.
+ Find the corresponding [repository on GitHub](https://github.com/apache/?query=commons-),
[fork](https://help.github.com/articles/fork-a-repo/) and check out your forked repository.
Making Changes
--------------
+ Create a _topic branch_ for your isolated work.
  * Usually you should base your branch on the `master` or `trunk` branch.
  * A good topic branch name can be the JIRA bug id plus a keyword, e.g. `TEXT-123-InputStream`.
  * If you have submitted multiple JIRA issues, try to maintain separate branches and pull requests.
+ Make commits of logical units.
  * Make sure your commit messages are meaningful and in the proper format. Your commit message should contain the key of the JIRA issue.
  * e.g. `TEXT-123: Close input stream earlier`
+ Respect the original code style:
  + Only use spaces for indentation.
  + Create minimal diffs - disable _On Save_ actions like _Reformat Source Code_ or _Organize Imports_. If you feel the source code should be reformatted create a separate PR for this change first.
  + Check for unnecessary whitespace with `git diff` -- check before committing.
+ Make sure you have added the necessary tests for your changes, typically in `src/test/java`.
+ Run all the tests with `mvn clean verify` to ensure nothing else was accidentally broken.
Making Trivial Changes
----------------------
The JIRA tickets are used to generate the changelog for the next release.
For changes of a trivial nature to comments and documentation, it is not always necessary to create a new ticket in JIRA.
In this case, it is appropriate to start the first line of a commit with '(doc)' instead of a ticket number.
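
The commit message conventions above can be checked mechanically. The following is a minimal sketch only: the class `CommitSubjectCheck` and its method are hypothetical and not part of Apache Commons Text or its build, and the pattern assumes the JIRA key sits at the start of the first line followed by a colon, as in the `TEXT-123: Close input stream earlier` example, or that the line starts with `(doc)` for a trivial change.

```java
import java.util.regex.Pattern;

/**
 * Hypothetical helper (not part of Apache Commons Text): checks that the
 * first line of a commit message follows the conventions described above.
 */
final class CommitSubjectCheck {

    // Assumption: either a JIRA key such as "TEXT-123" followed by ": " and a
    // summary, or the "(doc)" prefix used for trivial documentation changes.
    private static final Pattern SUBJECT =
            Pattern.compile("^(TEXT-\\d+: |\\(doc\\) ?)\\S.*$");

    static boolean isValidSubject(String commitMessage) {
        if (commitMessage == null) {
            return false;
        }
        // Only the first line (the subject) is checked; the body is ignored.
        String firstLine = commitMessage.split("\\R", 2)[0];
        return SUBJECT.matcher(firstLine).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidSubject("TEXT-123: Close input stream earlier")); // true
        System.out.println(isValidSubject("(doc) Fix typo in Javadoc"));            // true
        System.out.println(isValidSubject("Close input stream earlier"));           // false
    }
}
```

A check like this could run in a local Git hook; that is a possible use, not something the project prescribes.
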
Submitting Changes
------------------
+ Sign and submit the Apache [Contributor License Agreement][cla] if you haven't already.
  * Note that small patches & typical bug fixes do not require a CLA as
    clause 5 of the [Apache License](https://www.apache.org/licenses/LICENSE-2.0.html#contributions)
    covers them.
+ Push your changes to a topic branch in your fork of the repository.
+ Submit a _Pull Request_ to the corresponding repository in the `apache` organization.
  * Verify _Files Changed_ shows only your intended changes and does not
    include additional files like `target/*.class`
+ Update your JIRA ticket and include a link to the pull request in the ticket.
If you prefer to not use GitHub, then you can instead use
`git format-patch` (or `svn diff`) and attach the patch file to the JIRA issue.
Additional Resources
--------------------
+ [Contributing patches](https://commons.apache.org/patches.html)
+ [Apache Commons Text JIRA project page][jira]
+ [Contributor License Agreement][cla]
+ [General GitHub documentation](https://help.github.com/)
+ [GitHub pull request documentation](https://help.github.com/articles/creating-a-pull-request/)
+ [Apache Commons Twitter Account](https://twitter.com/ApacheCommons)
+ `#apache-commons` IRC channel on `irc.freenode.net`
[cla]:https://www.apache.org/licenses/#clas
[jira]:https://issues.apache.org/jira/browse/TEXT
commons-text-commons-text-1.8/LICENSE.txt 0000664 0000000 0000000 00000026136 13532277265 0020321 0 ustar 00root root 0000000 0000000
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
commons-text-commons-text-1.8/NOTICE.txt 0000664 0000000 0000000 00000000256 13532277265 0020213 0 ustar 00root root 0000000 0000000 Apache Commons Text
Copyright 2014-2019 The Apache Software Foundation
This product includes software developed at
The Apache Software Foundation (https://www.apache.org/).
commons-text-commons-text-1.8/README.md 0000664 0000000 0000000 00000012635 13532277265 0017754 0 ustar 00root root 0000000 0000000
Apache Commons Text
===================
[Build Status](https://travis-ci.org/apache/commons-text)
[Coverage Status](https://coveralls.io/r/apache/commons-text)
[Maven Central](https://maven-badges.herokuapp.com/maven-central/org.apache.commons/commons-text/)
[Javadocs](https://javadoc.io/doc/org.apache.commons/commons-text/1.8)
Apache Commons Text is a library focused on algorithms working on strings.
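
To give a feel for the kind of string algorithm meant here, the sketch below computes a Hamming distance (one of the measures listed in the release notes) using only the Java standard library. It is an illustration under stated assumptions: the class `HammingSketch` is hypothetical and is not the library's implementation or API.

```java
/**
 * Hypothetical, standard-library-only sketch of a Hamming distance.
 * Not the Apache Commons Text implementation.
 */
final class HammingSketch {

    /** Counts the positions at which two equal-length sequences differ. */
    static int distance(CharSequence left, CharSequence right) {
        if (left == null || right == null) {
            throw new IllegalArgumentException("Inputs must not be null");
        }
        if (left.length() != right.length()) {
            throw new IllegalArgumentException("Inputs must have the same length");
        }
        int differences = 0;
        for (int i = 0; i < left.length(); i++) {
            if (left.charAt(i) != right.charAt(i)) {
                differences++;
            }
        }
        return differences;
    }

    public static void main(String[] args) {
        // "karolin" and "kathrin" differ at three positions.
        System.out.println(distance("karolin", "kathrin")); // 3
    }
}
```
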
Documentation
-------------
More information can be found on the [Apache Commons Text homepage](https://commons.apache.org/proper/commons-text).
The [Javadoc](https://commons.apache.org/proper/commons-text/apidocs) can be browsed.
Questions related to the usage of Apache Commons Text should be posted to the [user mailing list][ml].
Where can I get the latest release?
-----------------------------------
You can download source and binaries from our [download page](https://commons.apache.org/proper/commons-text/download_text.cgi).
Alternatively you can pull it from the central Maven repositories:
```xml
<dependency>
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-text</artifactId>
  <version>1.8</version>
</dependency>
```
Contributing
------------
We accept Pull Requests via GitHub. The [developer mailing list][ml] is the main channel of communication for contributors.
There are some guidelines which will make applying PRs easier for us:
+ No tabs! Please use spaces for indentation.
+ Respect the code style.
+ Create minimal diffs - disable on save actions like reformat source code or organize imports. If you feel the source code should be reformatted create a separate PR for this change.
+ Provide JUnit tests for your changes and make sure your changes don't break any existing tests by running ```mvn clean test```.
If you plan to contribute on a regular basis, please consider filing a [contributor license agreement](https://www.apache.org/licenses/#clas).
You can learn more about contributing via GitHub in our [contribution guidelines](CONTRIBUTING.md).
License
-------
This code is under the [Apache License v2](https://www.apache.org/licenses/LICENSE-2.0).
See the `NOTICE.txt` file for required notices and attributions.
Donations
---------
You like Apache Commons Text? Then [donate back to the ASF](https://www.apache.org/foundation/contributing.html) to support the development.
Additional Resources
--------------------
+ [Apache Commons Homepage](https://commons.apache.org/)
+ [Apache Issue Tracker (JIRA)](https://issues.apache.org/jira/browse/TEXT)
+ [Apache Commons Twitter Account](https://twitter.com/ApacheCommons)
+ `#apache-commons` IRC channel on `irc.freenode.net`
[ml]:https://commons.apache.org/mail-lists.html
commons-text-commons-text-1.8/RELEASE-NOTES.txt 0000664 0000000 0000000 00000043155 13532277265 0021205 0 ustar 00root root 0000000 0000000 Apache Commons Text
Version 1.8
Release Notes
INTRODUCTION:
This document contains the release notes for the 1.8 version of Apache Commons Text.
Commons Text is a set of utility functions and reusable components for the purpose of processing
and manipulating text that should be of use in a Java environment.
Apache Commons Text is a library focused on algorithms working on strings.
Release 1.8
Changes in this version include:
- New Features
o TEXT-169: Add helper factory method org.apache.commons.text.StringSubstitutor.createInterpolator(). Thanks to Gary Gregory.
o TEXT-170: Add String lookup for host names and IP addresses (DnsStringLookup). Thanks to Gary Gregory.
- Fixed Bugs
o TEXT-167: commons-text web page missing "RELEASE-NOTES-1.7.txt". Thanks to Larry West.
o TEXT-168: (doc) Fixed wrong value for Jaro-Winkler example #117. Thanks to luksan47.
o TEXT-171: StringLookupFactory.addDefaultStringLookups(Map) does not convert keys to lower case. Thanks to Gary Gregory.
- Changes
o Expand Javadoc for StringSubstitutor and friends. Thanks to Gary Gregory.
o [site] checkstyle.version 8.21 -> 8.23. Thanks to Gary Gregory.
Historical list of changes: https://commons.apache.org/proper/commons-text/changes-report.html
For complete information on Apache Commons Text, including instructions on how to submit bug reports,
patches, or suggestions for improvement, see the Apache Commons Text website:
https://commons.apache.org/proper/commons-text
Download it from https://commons.apache.org/proper/commons-text/download_text.cgi
=============================================================================
Apache Commons Text
Version 1.7
Release Notes
INTRODUCTION:
This document contains the release notes for the 1.7 version of Apache Commons Text.
Commons Text is a set of utility functions and reusable components for the purpose of processing
and manipulating text that should be of use in a Java environment.
Apache Commons Text is a library focused on algorithms working on strings.
Changes in this version include:
New features:
o TEXT-148: Add an enum to the lookup package that lists all StringLookups
o TEXT-127: Add a toggle to throw an exception when a variable is unknown in StringSubstitutor. Thanks to Jean-Baptiste REICH, Sebb, Don Jeba, Gary Gregory.
o TEXT-138: TextStringBuilder append sub-sequence not consistent with Appendable. Thanks to Neal Johnson, Don Jeba.
o TEXT-152: Fix possible infinite loop in WordUtils.wrap for a regex pattern that would trigger on a match of 0 length. Thanks to @CAPS50.
o TEXT-155: Add a generic IntersectionSimilarity measure
Fixed Bugs:
o TEXT-111: WordUtils.wrap must calculate offset increment from wrapOn pattern length. Thanks to @CAPS50.
o TEXT-151: Fix the JaroWinklerSimilarity to use StringUtils.equals to test for CharSequence equality
o TEXT-165: ResourceBundleStringLookup.lookup(String) throws MissingResourceException instead of returning null.
Changes:
o TEXT-104: Jaro Winkler Distance refers to similarity. Thanks to Sascha Szott.
o TEXT-153: Make prefixSet in LookupTranslator a BitSet. Thanks to amirhadadi.
o TEXT-156: Fix the RegexTokenizer to use a static Pattern
o TEXT-157: Remove rounding from JaccardDistance and JaccardSimilarity
o TEXT-162: Update Apache Commons Lang from 3.8.1 to 3.9.
o Update tests from org.assertj:assertj-core 3.12.1 to 3.12.2.
o Update site from com.puppycrawl.tools:checkstyle 8.18 to 8.21.
Historical list of changes: https://commons.apache.org/proper/commons-text/changes-report.html
For complete information on Apache Commons Text, including instructions on how to submit bug reports,
patches, or suggestions for improvement, see the Apache Commons Text website:
https://commons.apache.org/proper/commons-text
Download it from https://commons.apache.org/proper/commons-text/download_text.cgi
=============================================================================
Apache Commons Text
Version 1.6
Release Notes
INTRODUCTION
============
This document contains the release notes for the 1.6 version of Apache Commons
Text. Commons Text is a set of utility functions and reusable components for
the purpose of processing and manipulating text that should be of use in a Java
environment.
This component requires Java 8.
CHANGES
=======
o TEXT-144: Add the resource string bundle string lookup to the default set of lookups
o TEXT-145: Add StringLookupFactory methods for the URL encoder and decoder string lookups
o TEXT-146: org.apache.commons.text.lookup.StringLookupFactory.interpolatorStringLookup() should reuse a singleton instance
o TEXT-147: Add a Base64 encoder string lookup.
Historical list of changes: https://commons.apache.org/proper/commons-text/changes-report.html
For complete information on Apache Commons Text, including instructions on how to submit bug reports,
patches, or suggestions for improvement, see the Apache Commons Text website:
https://commons.apache.org/proper/commons-text
=============================================================================
Apache Commons Text
Version 1.5
Release Notes
INTRODUCTION
============
This document contains the release notes for the 1.5 version of Apache Commons
Text. Commons Text is a set of utility functions and reusable components for
the purpose of processing and manipulating text that should be of use in a Java
environment.
This component requires Java 8.
NEW FEATURES
============
o TEXT-133: Add a XML file XPath string lookup.
o TEXT-134: Add a Properties file string lookup.
o TEXT-135: Add a script string lookup.
o TEXT-136: Add a file string lookup.
o TEXT-137: Add a URL string lookup.
o TEXT-140: Add a Base64 string lookup.
o TEXT-141: Add org.apache.commons.text.lookup.StringLookupFactory.resourceBundleStringLookup(String).
o TEXT-142: Add URL encoder and decoder string lookups.
o TEXT-143: Add constant string lookup like the one in Apache Commons Configuration.
FIXED BUGS
==========
o TEXT-139: Improve JaccardSimilarity computational cost. Thanks to Nick Wong.
o TEXT-118: JSON escaping incorrect for the delete control character. Thanks to Nandor Kollar.
o TEXT-130: Fixes JaroWinklerDistance: Wrong results due to precision of transpositions. Thanks to Jan Martin Keil.
o TEXT-131: JaroWinklerDistance: Calculation deviates from definition. Thanks to Jan Martin Keil.
CHANGES
=======
o TEXT-132: Update Apache Commons Lang from 3.7 to 3.8.1
=============================================================================
Apache Commons Text
Version 1.4
Release Notes
INTRODUCTION
============
This document contains the release notes for the 1.4 version of Apache Commons
Text. Commons Text is a set of utility functions and reusable components for
the purpose of processing and manipulating text that should be of use in a Java
environment.
This component requires Java 8.
Changes in this version include:
Fixed Bugs:
o TEXT-120: StringEscapeUtils#unescapeJson does not unescape double quotes and forward slash.
o TEXT-119: Remove mention of SQL escaping from user guide.
o TEXT-123: WordUtils.wrap throws StringIndexOutOfBoundsException when wrapLength is Integer.MAX_VALUE. Thanks to Takanobu Asanuma.
Changes:
o TEXT-121: Update Java requirement from version 7 to 8. Thanks to pschumacher.
o TEXT-122: Allow full customization with new API org.apache.commons.text.lookup.StringLookupFactory.interpolatorStringLookup(Map, StringLookup, boolean).
=============================================================================
Apache Commons Text
Version 1.3
Release Notes
INTRODUCTION
============
This document contains the release notes for the 1.3 version of Apache Commons
Text. Commons Text is a set of utility functions and reusable components for
the purpose of processing and manipulating text that should be of use in a Java
environment.
This component requires Java 7.
NEW FEATURES
=============
o Add Automatic-Module-Name MANIFEST entry for Java 9 compatibility Issue: TEXT-110.
o Add an interpolator string lookup: StringLookupFactory#interpolatorStringLookup() Issue: TEXT-113.
o Add a StrSubstitutor replacement based on interfaces: StringSubstitutor Issue: TEXT-114.
o Add a StrBuilder replacement based on the StringMatcher interface: TextStringBuilder Issue: TEXT-115.
o Add a StrTokenizer replacement based on the StringMatcher interface: StringTokenizer Issue: TEXT-116.
o Add a local host string lookup: LocalHostStringLookup Issue: TEXT-117.
FIXED BUGS
==========
o Build failure with java 9-ea+159 Issue: TEXT-70.
o StrLookup API confusing Issue: TEXT-80.
=============================================================================
Apache Commons Text
Version 1.2
Release Notes
INTRODUCTION
============
This document contains the release notes for the 1.2 version of Apache Commons
Text. Commons Text is a set of utility functions and reusable components for
the purpose of processing and manipulating text that should be of use in a Java
environment.
This component requires Java 7.
JAVA 9 SUPPORT
==============
At our time of release of 1.1, our build succeeds with Java 9-ea build 159,
and we believe all of our features to be Java 9 compatible. However, when we
run "mvn clean site" we have failures.
NEW FEATURES
=============
o TEXT-74: StrSubstitutor: Ability to turn off substitution in values. Thanks to Ioannis Sermetziadis.
o TEXT-97: RandomStringGenerator able to pass multiple ranges to .withinRange(). Thanks to Amey Jadiye.
o TEXT-89: WordUtils.initials support for UTF-16 surrogate pairs. Thanks to Arun Vinud S S.
o TEXT-90: Add CharacterPredicates for ASCII letters (uppercase/lowercase) and arabic numerals.
o TEXT-85: Added CaseUtils class with camel case conversion support. Thanks to Arun Vinud S S.
o TEXT-91: RandomStringGenerator should be able to generate a String with a random length.
o TEXT-102: Add StrLookup.resourceBundleLookup(ResourceBundle).
FIXED BUGS
==========
o TEXT-106: Exception thrown in ExtendedMessageFormat using quotes with custom registry. Thanks to Benoit Moreau.
o TEXT-100: StringEscapeUtils#UnEscapeJson doesn't recognize escape signs correctly. Thanks to Don Jeba.
o TEXT-105: Typo in LongestCommonSubsequence#logestCommonSubsequence. Thanks to Abrasha.
CHANGES
=======
o TEXT-107: Upversion commons-lang to 3.7.
o TEXT-98: Deprecate isDelimiter and use HashSets for delimiter checks. Thanks to Arun Vinud S S.
o TEXT-88: WordUtils should treat an empty delimiter array as no delimiters. Thanks to Amey Jadiye.
o TEXT-93: Update RandomStringGenerator to accept a list of valid characters. Thanks to Amey Jadiye.
o TEXT-92: Update commons-lang dependency to version 3.6.
o TEXT-83: Document that commons-csv should be used in preference to CsvTranslators. Thanks to Amey Jadiye.
o TEXT-67: NumericEntityUnescaper.options - fix TODO.
o TEXT-84: RandomStringGenerator claims to be immutable, but isn't.
=============================================================================
Release Notes for version 1.1
JAVA 9 SUPPORT
==============
At our time of release of 1.1, our build succeeds with Java 9-ea build 159,
and we believe all of our features to be Java 9 compatible. However, when we
run "mvn clean site" we have failures.
NEW FEATURES
============
o TEXT-41: WordUtils.abbreviate support. Thanks to Amey Jadiye.
o TEXT-82: Putting WordUtils back in to the codebase. Thanks to Amey Jadiye.
o TEXT-81: Add RandomStringGenerator. Thanks to djones.
o TEXT-36: RandomStringGenerator: allow users to provide source of randomness.
Thanks to Raymond DeCampo.
FIXED BUGS
==========
o TEXT-76: Correct round issue in Jaro Winkler implementation
o TEXT-72: Similar to LANG-1025, clirr fails site build.
CHANGES
=======
o TEXT-39: WordUtils should use toXxxxCase(int) rather than toXxxxCase(char).
Thanks to Amey Jadiye.
=============================================================================
Release Notes for version 1.0
INCOMPATIBLE CHANGES
====================
All package names changed from org.apache.commons.text.beta in 1.0-beta-1 to
org.apache.commons.text in 1.0.
Methods StringEscapeUtils#escapeHtml3Once and StringEscapeUtils#escapeHtml4Once
have been removed; see TEXT-40
JAVA 9 SUPPORT
==============
At our time of release of 1.0, our build succeeds with Java 9-ea build 158,
and we believe all of our features to be Java 9 compatible. However, when we run
"mvn clean site" we have failures.
FIXED BUGS
==========
o TEXT-64: Investigate locale issue in ExtendedMessageFormatTest. Thanks to
chtompki.
o TEXT-69: Resolve PMD/CMD Violations
o TEXT-65: Fixing the 200 checkstyle errors present in 1.0-beta-1.
o TEXT-63: Mutable fields should be private.
REMOVED
=======
o TEXT-40: Escape HTML characters only once: revert.
=============================================================================
Release Notes for version 1.0-beta-1
A NOTE ON THE HISTORY OF THE CODE
=================================
The codebase began in the fall of 2014 as a home for algorithms operating on
Strings that seemed more complex than those which would be considered a needed
extension to java.lang. Thus, a new component, distinct from Apache Commons
Lang, was warranted. As the project evolved, it was noticed that Commons Lang
had considerably more text manipulation tools than the average Java application
developer would need or even want. So, we have decided to move the more
esoteric String processing algorithms out of Commons Lang into Commons Text.
JAVA 9 SUPPORT
==============
At the time of the 1.0-beta-1 release, our build succeeds with Java 9-ea build
153, and we believe all of our features to be Java 9 compatible.
NEW FEATURES
============
o TEXT-56: Move CsvTranslators out of StringEscapeUtils and make them DRY.
Thanks to Jarek Strzelecki.
o TEXT-40: Escape HTML characters only once. Thanks to Sampanna Kahu.
o TEXT-32: Add LCS similarity and distance
o TEXT-34: Add class to generate random strings
o TEXT-29: Add a builder to StringEscapeUtils
o TEXT-28: Add shell/XSI escape/unescape support
o TEXT-2: Add Jaccard Index and Jaccard Distance Thanks to Don Jeba.
o TEXT-27: Move org.apache.commons.lang3.StringEscapeUtils.java into text
o TEXT-23: Moving from commons-lang, the package org.apache.commons.lang3.text
o TEXT-10: A more complex Levenshtein distance Thanks to Don Jeba.
o TEXT-24: Add coveralls and Travis.ci integration
o TEXT-19: Add alphabet converter Thanks to Eyal Allweil.
o TEXT-13: Create Commons Text logo
o TEXT-7: Write user guide
o TEXT-15: Human name parser
o TEXT-3: Add Cosine Similarity and Cosine Distance
o TEXT-4: Port Myers algorithm from [collections]
o TEXT-1: Add Hamming distance
o TEXT-9: Incorporate String algorithms from Commons Lang Thanks to britter.
FIXED BUGS
==========
Note. We recognize the curiosity of a new component having "fixed bugs," but a
considerable number of files were migrated over from Commons Lang, some of which
needed fixes.
o TEXT-62: Incorporate suggestions from RC2 into 1.0 release.
o TEXT-60: Upgrading Jacoco for Java 9-ea compatibility. Thanks to Lee Adcock.
o TEXT-52: Possible attacks through StringEscapeUtils.escapeEcmaScript; better
Javadoc
o TEXT-37: Global vs local source of randomness
o TEXT-38: Fluent API in "RandomStringBuilder"
o TEXT-26: Fix JaroWinklerDistance in the manner of LUCENE-1297
o TEXT-35: Unfinished class Javadoc for CosineDistance
o TEXT-22: LevenshteinDistance reduce memory consumption
o TEXT-5: IP clearance for the names package
o TEXT-11: Work on the string metric, distance, and similarity definitions for
the project
o TEXT-12: Create StringDistanceFrom class that contains a StringMetric and
the "left" side string. This would have a method that accepts the
"right" side string to test. Thanks to Jonathan Baker.
o TEXT-8: Change (R) StringMetric.compare(CS left, CS right) to "apply" so
that it is consistent with BiFunction. Thanks to Jonathan Baker.
o TEXT-6: Allow extra information (e.g. Levenshtein threshold) to be stored
as (final) fields in the StringMetric instance. Thanks to Jonathan
Baker.
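
The design described in TEXT-8, TEXT-12 and TEXT-6 can be sketched with the JDK alone. This is not the library's code; MetricSketch, HAMMING and DistanceFrom are hypothetical names. The metric is a BiFunction, so callers invoke apply(left, right), and a small wrapper stores the metric and the "left" string in final fields and accepts only the "right" string.

```java
import java.util.function.BiFunction;

/** Illustrative sketch only; all names here are hypothetical. */
final class MetricSketch {

    /** Hamming distance expressed as a BiFunction, so it is invoked via apply(left, right). */
    static final BiFunction<CharSequence, CharSequence, Integer> HAMMING = (left, right) -> {
        if (left == null || right == null) {
            throw new IllegalArgumentException("Inputs must not be null");
        }
        if (left.length() != right.length()) {
            throw new IllegalArgumentException("Inputs must have the same length");
        }
        int distance = 0;
        for (int i = 0; i < left.length(); i++) {
            if (left.charAt(i) != right.charAt(i)) {
                distance++;
            }
        }
        return distance;
    };

    /** Binds a metric to a fixed "left" string; both are kept in final fields. */
    static final class DistanceFrom<R> {
        private final BiFunction<CharSequence, CharSequence, R> metric;
        private final CharSequence left;

        DistanceFrom(final BiFunction<CharSequence, CharSequence, R> metric,
                final CharSequence left) {
            if (metric == null) {
                throw new IllegalArgumentException("The metric must not be null");
            }
            this.metric = metric;
            this.left = left;
        }

        /** Compares the stored "left" string with the given "right" string. */
        R apply(final CharSequence right) {
            return metric.apply(left, right);
        }
    }

    public static void main(final String[] args) {
        final DistanceFrom<Integer> fromKarolin = new DistanceFrom<>(HAMMING, "karolin");
        System.out.println(fromKarolin.apply("kathrin")); // prints 3
    }
}
```

Because the metric is a plain BiFunction, it composes with the rest of java.util.function (for example andThen) without any adapter code.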
CHANGES
=======
o TEXT-61: Naming packages org.apache.commons.text.beta Thanks to Lee Adcock.
o TEXT-58: Refactor EntityArrays to have unmodifiableMaps in lieu of String[][]
o TEXT-53: Prepare site for 1.0 release
o TEXT-50: Upgrade from commons-parent version 41 to version 42
o TEXT-33: Consolidating since tags at 1.0, removing deprecated methods
o TEXT-16: Improve HumanNameParser
REMOVED
=======
o TEXT-55: Remove WordUtils to be added back in an upcoming 1.X release
o TEXT-51: Remove RandomStringGenerator to be added back in the 1.1 release
o TEXT-31: Remove org.apache.commons.text.names, for later release than 1.0
Historical list of changes: https://commons.apache.org/text/changes-report.html
For complete information on Apache Commons Text, including instructions on how
to submit bug reports, patches, or suggestions for improvement, see the
Apache Commons Text website:
https://commons.apache.org/text/
Have fun!
-Apache Commons Text team
commons-text-commons-text-1.8/checkstyle-suppressions.xml 0000664 0000000 0000000 00000005400 13532277265 0024120 0 ustar 00root root 0000000 0000000
commons-text-commons-text-1.8/checkstyle.xml 0000664 0000000 0000000 00000017147 13532277265 0021360 0 ustar 00root root 0000000 0000000
commons-text-commons-text-1.8/license-header.txt 0000664 0000000 0000000 00000001442 13532277265 0022100 0 ustar 00root root 0000000 0000000
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
commons-text-commons-text-1.8/pom.xml 0000664 0000000 0000000 00000037134 13532277265 0020013 0 ustar 00root root 0000000 0000000
4.0.0org.apache.commonscommons-parent48commons-text1.8Apache Commons TextApache Commons Text is a library focused on algorithms working on strings.https://commons.apache.org/proper/commons-textISO-8859-1UTF-81.81.8textorg.apache.commons.text1.8(Java 8+)TEXT12318221texthttps://svn.apache.org/repos/infra/websites/production/commons/content/proper/commons-textsite-content3.0.08.233.1.120.8.43.1.1false0.14.1false1.7RC1truescm:svn:https://dist.apache.org/repos/dist/dev/commons/${commons.componentid}Gary Gregory86fdc7e2a11262cborg.apache.commonscommons-lang33.9org.junit.jupiterjunit-jupiter5.5.1testorg.assertjassertj-core3.13.2testclean verify apache-rat:check clirr:check checkstyle:check spotbugs:check javadoc:javadocorg.apache.ratapache-rat-pluginsite-content/**src/site/resources/download_lang.cgisrc/test/resources/stringEscapeUtilsTestData.txtsrc/site/resources/release-notes/RELEASE-NOTES-*.txtcom.github.siom79.japicmpjapicmp-maven-pluginfalsemaven-checkstyle-plugin${checkstyle.plugin.version}false${basedir}/checkstyle.xml${basedir}/license-header.txt${basedir}/checkstyle-suppressions.xml${basedir}/checkstyle-suppressions.xmltruecom.puppycrawl.toolscheckstyle${checkstyle.version}com.github.spotbugsspotbugs-maven-plugin${spotbugs.plugin.version}${basedir}/sb-excludes.xmlmaven-assembly-pluginsrc/assembly/bin.xmlsrc/assembly/src.xmlgnuorg.apache.maven.pluginsmaven-jar-plugintest-jar${commons.module.name}org.apache.maven.pluginsmaven-scm-publish-pluginjavadocsorg.apache.maven.pluginsmaven-javadoc-plugin${maven.compiler.source}maven-checkstyle-plugin${checkstyle.plugin.version}false${basedir}/checkstyle.xml${basedir}/license-header.txt${basedir}/checkstyle-suppressions.xml${basedir}/checkstyle-suppressions.xmltruecheckstylecom.github.spotbugsspotbugs-maven-plugin${spotbugs.plugin.version}${basedir}/sb-excludes.xmlorg.codehaus.mojoclirr-maven-plugininfomaven-pmd-plugin3.12.0${maven.compiler.target}pmdcpdorg.codehaus.mojotaglist-maven-plugin2.4Needs 
WorkTODOexactFIXMEexactXXXexactNoteable MarkersNOTEexactNOPMDexactNOSONARexactorg.codehaus.mojojavancss-maven-plugin2014kinowBruno P. Kinoshitakinow@apache.orgbritterBenedikt Ritterbritter@apache.orgchtompkiRob Tompkinschtompki@apache.orgggregoryGary Gregoryggregory@apache.orgdjonesDuncan Jonesdjones@apache.orgDon Jebadonjeba@yahoo.comSampanna KahuJarek StrzeleckiLee AdcockAmey Jadiyeameyjadiye@gmail.comArun Vinud S SIoannis SermetziadisJostein TveitLuciano MedalliaJan Martin KeilNandor KollarNick Wongscm:git:http://gitbox.apache.org/repos/asf/commons-textscm:git:https://gitbox.apache.org/repos/asf/commons-texthttps://gitbox.apache.org/repos/asf?p=commons-text.gitjirahttps://issues.apache.org/jira/browse/TEXTapache.websiteApache Commons Sitescm:svn:https://svn.apache.org/repos/infra/websites/production/commons/content/proper/commons-text/setup-checkoutsite-contentorg.apache.maven.pluginsmaven-antrun-pluginprepare-checkoutrunpre-sitejava9+[9,)truejava13+[13,)truetrue
commons-text-commons-text-1.8/sb-excludes.xml 0000664 0000000 0000000 00000002341 13532277265 0021426 0 ustar 00root root 0000000 0000000
commons-text-commons-text-1.8/src/ 0000775 0000000 0000000 00000000000 13532277265 0017255 5 ustar 00root root 0000000 0000000 commons-text-commons-text-1.8/src/assembly/ 0000775 0000000 0000000 00000000000 13532277265 0021074 5 ustar 00root root 0000000 0000000 commons-text-commons-text-1.8/src/assembly/bin.xml 0000664 0000000 0000000 00000003061 13532277265 0022366 0 ustar 00root root 0000000 0000000
bintar.gzzipfalseLICENSE.txtNOTICE.txtRELEASE-NOTES.txttarget*.jartarget/site/apidocsapidocs
commons-text-commons-text-1.8/src/assembly/src.xml 0000664 0000000 0000000 00000003312 13532277265 0022404 0 ustar 00root root 0000000 0000000
srctar.gzzip${project.artifactId}-${commons.release.version}-srccheckstyle.xmlcheckstyle-suppressions.xmlCONTRIBUTING.mdsb-excludes.xmlLICENSE.txtlicense-header.txtNOTICE.txtpom.xmlPROPOSAL.htmlREADME.mdRELEASE-NOTES.txtsrc
commons-text-commons-text-1.8/src/changes/ 0000775 0000000 0000000 00000000000 13532277265 0020665 5 ustar 00root root 0000000 0000000 commons-text-commons-text-1.8/src/changes/changes.xml 0000664 0000000 0000000 00000044536 13532277265 0023033 0 ustar 00root root 0000000 0000000
Apache Commons Text Changescommons-text web page missing "RELEASE-NOTES-1.7.txt"(doc) Fixed wrong value for Jaro-Winkler example #117Add helper factory method org.apache.commons.text.StringSubstitutor.createInterpolator().Add String lookup for host names and IP addresses (DnsStringLookup).StringLookupFactory.addDefaultStringLookups(Map) does not convert keys to lower case.Expand Javadoc for StringSubstitutor and friends.[site] checkstyle.version 8.21 -> 8.23.WordUtils.wrap must calculate offset increment from wrapOn pattern lengthJaro Winkler Distance refers to similarityAdd an enum to the lookup package that lists all StringLookupsAdd a toggle to throw an exception when a variable is unknown in StringSubstitutorTextStringBuilder append sub-sequence not consistent with Appendable.Fix possible infinite loop in WordUtils.wrap for a regex pattern that would trigger on a match of 0 lengthMake prefixSet in LookupTranslator a BitSetFix the RegexTokenizer to use a static PatternRemove rounding from JaccardDistance and JaccardSimilarityFix the JaroWinklerSimilarity to use StringUtils.equals to test for CharSequence equalityAdd a generic IntersectionSimilarity measureUpdate Apache Commons Lang from 3.8.1 to 3.9.ResourceBundleStringLookup.lookup(String) throws MissingResourceException instead of returning null.Update tests from org.assertj:assertj-core 3.12.1 to 3.12.2.Update site from com.puppycrawl.tools:checkstyle 8.18 to 8.21.Add the resource string bundle string lookup to the default set of lookupsAdd StringLookupFactory methods for the URL encoder and decoder string lookupsorg.apache.commons.text.lookup.StringLookupFactory.interpolatorStringLookup() should reuse a singleton instanceAdd a Base64 encoder string lookup.Improve JaccardSimilarity computational costJSON escaping incorrect for the delete control characterFixes JaroWinklerDistance: Wrong results due to precision of transpositionsJaroWinklerDistance: Calculation deviates from definitionUpdate Apache Commons 
Lang from 3.7 to 3.8.1Add a XML file XPath string lookup.Add a Properties file string lookup.Add a script string lookup.Add a file string lookup.Add a URL string lookup.Add a Base64 string lookup.Add org.apache.commons.text.lookup.StringLookupFactory.resourceBundleStringLookup(String).Add URL encoder and decoder string lookups.Add constant string lookup like the one in Apache Commons Configuration.StringEscapeUtils#unescapeJson does not unescape double quotes and forward slashRemove mention of SQL escaping from user guideUpdate Java requirement from version 7 to 8.Allow full customization with new API org.apache.commons.text.lookup.StringLookupFactory.interpolatorStringLookup(Map<String, StringLookup>, StringLookup, boolean).WordUtils.wrap throws StringIndexOutOfBoundsException when wrapLength is Integer.MAX_VALUE.Add Automatic-Module-Name MANIFEST entry for Java 9 compatibilityBuild failure with java 9-ea+159Add an interpolator string lookup: StringLookupFactory#interpolatorStringLookup()Add a StrSubstitutor replacement based on interfaces: StringSubstitutorAdd a StrBuilder replacement based on the StringMatcher interface: TextStringBuilderAdd a StrTokenizer replacement based on the StringMatcher interface: StringTokenizerAdd a local host string lookup: LocalHostStringLookupStrLookup API confusingUpversion commons-lang to 3.7Exception thrown in ExtendedMessageFormat using quotes with custom registryStringEscapeUtils#UnEscapeJson doesn't recognize escape signs correctlyStrSubstitutor: Ability to turn off substitution in valuesRandomStringGenerator able to pass multiple ranges to .withinRange()Deprecate isDelimiter and use HashSets for delimiter checksWordUtils.initials support for UTF-16 surrogate pairsWordUtils should treat an empty delimiter array as no delimitersUpdate RandomStringGenerator to accept a list of valid charactersAdd CharacterPredicates for ASCII letters (uppercase/lowercase) and arabic numeralsAdded CaseUtils class with camel case conversion 
supportRandomStringGenerator should be able to generate a String with a random lengthUpdate commons-lang dependency to version 3.6Document that commons-csv should be used in preference to CsvTranslatorsNumericEntityUnescaper.options - fix TODORandomStringGenerator claims to be immutable, but isn'tAdd StrLookup.resourceBundleLookup(ResourceBundle)Typo in LongestCommonSubsequence#logestCommonSubsequenceWordUtils should use toXxxxCase(int) rather than toXxxxCase(char)WordUtils.abbreviate supportPutting WordUtils back in to the codebaseAdd RandomStringGeneratorRandomStringGenerator: allow users to provide source of randomnessCorrect round issue in Jaro Winkler implementationSimilar to LANG-1025, clirr fails site build.Investigate locale issue in ExtendedMessageFormatTestResolve PMD/CMD ViolationsEscape HTML characters only once: revertFixing the 200 checkstyle errors present in 1.0-beta-1Mutable fields should be privateIncorporate suggestions from RC2 into 1.0 releaseNaming packages org.apache.commons.text.betaUpgrading Jacoco for Java 9-ea compatibility.Refactor EntityArrays to have unmodifiableMaps in leu of String[][]Prepare site for 1.0 releaseMove CvsTranslators out of StringEscapeUtils and make them DRYRemove WordUtils to be added back in an upcoming 1.X releasePossible attacks through StringEscapeUtils.escapeEcmaScrip better javadocRemove RandomStringGenerator to be added back in the 1.1 releaseUpgrade from commons-parent version 41 to version 42Escape HTML characters only onceGlobal vs local source of randomnessFluent API in "RandomStringBuilder"Fix JaroWinklerDistance in the manner of LUCENE-1297Add LCS similarity and distanceAdd class to generate random stringsUnfinished class Javadoc for CosineDistanceConsolidating since tags at 1.0, removing deprecated methodsAdd a builder to StringEscapeUtilsAdd shell/XSI escape/unescape supportLevenshteinDistance reduce memory consumptionRemove org.apache.commons.text.names, for later release than 1.0Add Jaccard Index and 
Jaccard DistanceMove org.apache.commons.lang3.StringEscapeUtils.java into textMoving from commons-lang, the package org.apache.commons.lang3.textA more complex Levenshtein distanceAdd coveralls and Travis.ci integrationAdd alphabet converterCreate Commons Text logoImprove HumanNameParserIP clearance for the names packageWrite user guideWork on the string metric, distance, and similarity definitions for the projectHuman name parserCreate StringDistanceFrom class that contains a StringMetric and the "left" side string. This would have a method that accepts the "right" side string to test.Add Cosine Similarity and Cosine DistanceChange (R) StringMetric.compare(CS left, CS right) to "apply" so that it is consistent with BiFunction.Allow extra information (e.g. Levenshtein threshold) to be stored as (final) fields in the StringMetric instance.Port Myers algorithm from [collections]Add Hamming distanceIncorporate String algorithms from Commons Lang
commons-text-commons-text-1.8/src/changes/release-notes.vm 0000664 0000000 0000000 00000010701 13532277265 0023776 0 ustar 00root root 0000000 0000000
## Licensed to the Apache Software Foundation (ASF) under one
## or more contributor license agreements. See the NOTICE file
## distributed with this work for additional information
## regarding copyright ownership. The ASF licenses this file
## to you under the Apache License, Version 2.0 (the
## "License"); you may not use this file except in compliance
## with the License. You may obtain a copy of the License at
##
## http://www.apache.org/licenses/LICENSE-2.0
##
## Unless required by applicable law or agreed to in writing,
## software distributed under the License is distributed on an
## "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
## KIND, either express or implied. See the License for the
## specific language governing permissions and limitations
## under the License.
##
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
${project.name}
Version ${version}
Release Notes
INTRODUCTION:
This document contains the release notes for the ${version} version of Apache Commons Text.
Commons Text is a set of utility functions and reusable components for the purpose of processing
and manipulating text that should be of use in a Java environment.
$introduction.replaceAll("(?