NUTCH-1807 Avoid methods relying on system-specific default locale / charset #924
Open
sebastian-nagel wants to merge 12 commits into
Member
Excellent initiative, thanks!
Contributor
Author
Code cleaned up. The forbidden API checks can be run per They are not yet integrated into automated workflow runs.
aae94cc to
8a40ec5
…charset Integrate Forbidden API checker into build.
…charset Fix forbidden calls in Nutch core classes.
Indexer, field "binaryContent": try to decode binary content using the charset from parse metadata, cf. NUTCH-2773.
Code cleanup in updated classes:
- remove trailing whitespace
- sort and group imports
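The "binaryContent" change above decodes bytes with a charset taken from metadata instead of the platform default. A minimal sketch of that idea follows. The class `BinaryContentDecoder`, its method names, and the UTF-8 fallback are assumptions for illustration, not the code in this PR.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

/** Hypothetical helper, not part of Nutch. */
public class BinaryContentDecoder {

  /**
   * Resolves a charset name (e.g. taken from parse metadata).
   * Falling back to UTF-8 is an assumption made for this sketch.
   */
  public static Charset resolveCharset(String charsetName) {
    if (charsetName == null || charsetName.trim().isEmpty()) {
      return StandardCharsets.UTF_8;
    }
    try {
      return Charset.forName(charsetName.trim());
    } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
      // Unknown or malformed name: use the fallback, never the JVM default.
      return StandardCharsets.UTF_8;
    }
  }

  /** Decodes bytes with an explicit charset; new String(bytes) is avoided. */
  public static String decode(byte[] content, String charsetName) {
    return new String(content, resolveCharset(charsetName));
  }

  public static void main(String[] args) {
    byte[] latin1 = { (byte) 0xE9 }; // U+00E9 encoded as ISO-8859-1
    System.out.println(decode(latin1, "ISO-8859-1"));
  }
}
```

The point of the sketch is that the charset is always passed explicitly, so the result does not depend on the system the crawler runs on.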
…charset Move cachepath / taskdef task definitions into target "forbidden-api-checks" to use the ivy lib installed in ivy/
…charset Link to NUTCH-3181 (URL construction deprecation).
The urlfilter-domain unit tests use: src/plugin/urlfilter-domain/data/hosts.txt
…charset Complete classpath for forbidden API checks: add plugin jars
…charset
- Remove unnecessary string lowercasing and trimming in ProtocolFactory.getProtocol(url).
- For efficiency, do host/domain lookups only if protocol mappings are configured (by default there are none).
- Add unit tests for per-host and per-domain protocol mappings.
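The lookup shortcut described above can be sketched as follows. This is not the ProtocolFactory code: the class `ProtocolMappingSketch`, its methods, and the plugin ids are hypothetical, and walking up parent domains label by label is a simplification assumed for this example.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of per-host and per-domain protocol mappings. */
public class ProtocolMappingSketch {

  private final Map<String, String> hostMappings = new HashMap<>();
  private final Map<String, String> domainMappings = new HashMap<>();

  public void addHostMapping(String host, String pluginId) {
    hostMappings.put(host, pluginId);
  }

  public void addDomainMapping(String domain, String pluginId) {
    domainMappings.put(domain, pluginId);
  }

  /** Returns the mapped plugin id, or null if the default should be used. */
  public String lookup(URI uri) {
    // Fast path: with no mappings configured, skip host/domain extraction.
    if (hostMappings.isEmpty() && domainMappings.isEmpty()) {
      return null;
    }
    String host = uri.getHost();
    if (host == null) {
      return null;
    }
    String pluginId = hostMappings.get(host);
    if (pluginId != null) {
      return pluginId;
    }
    // Assumed simplification: try the host and each parent domain in turn.
    String domain = host;
    while (true) {
      pluginId = domainMappings.get(domain);
      if (pluginId != null) {
        return pluginId;
      }
      int dot = domain.indexOf('.');
      if (dot < 0) {
        return null;
      }
      domain = domain.substring(dot + 1);
    }
  }
}
```

A host mapping takes precedence over a domain mapping in this sketch; whether Nutch applies the same precedence is not stated in the commit message.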
…charset Enable forbidden API checks for plugins, including plugin unit tests. Allow reflection usage because some unit tests require it.
…charset Fix forbidden calls in Nutch core classes, unit tests and plugins.
Code cleanup in updated classes:
- remove trailing whitespace
- sort and group imports
- remove unnecessary casts
- add missing override annotations
…charset Add license headers to configuration files. Allow tabs in configuration files.
…charset Change target forbidden-api-checks to verify only Nutch core and plugin classes, not test classes. Test classes are checked only when tests are run. Integrate into CI workflows: run the Forbidden API checks whenever tests are run, that is, only when Nutch core or plugin classes have changed.
8a40ec5 to
4319278
- remove tabs in Java code (added lines)
- remove @author annotations in change context
b6e9b1b to
b845812
Integrate the Forbidden API checker into the Nutch build. It uses the Ant cachepath task to call the Forbidden APIs jar, cf. Forbidden API Ant Usage.
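To illustrate the kind of call such a check is meant to catch, here is a small sketch that contrasts default-dependent JDK methods with explicit variants. The class and method names are hypothetical and not taken from Nutch; only standard JDK calls are used.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Hypothetical examples of locale- and charset-independent calls. */
public class LocaleSafeExamples {

  /** Lowercases independently of the JVM default locale. */
  public static String normalizeKey(String s) {
    // s.toLowerCase() uses the default locale; under a Turkish locale
    // the letter "I" is lowercased to a dotless i (U+0131).
    return s.toLowerCase(Locale.ROOT);
  }

  /** Encodes with an explicit charset instead of s.getBytes(). */
  public static byte[] encode(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  /** Decodes with an explicit charset instead of new String(bytes). */
  public static String decode(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }

  /** Formats with a fixed locale so the decimal separator is stable. */
  public static String formatScore(double score) {
    return String.format(Locale.ROOT, "%.2f", score);
  }

  public static void main(String[] args) {
    System.out.println(normalizeKey("TITLE"));
    System.out.println(formatScore(1.5));
    System.out.println(decode(encode("Z\u00fcrich")));
  }
}
```

With the explicit arguments, the output is the same regardless of the `user.language` or `file.encoding` settings of the machine running the code.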
TODO:
Note: Yetus failure is caused by required tabs in this file.