Please consult the CLARIAH Software Metadata Requirements at https://github.com/CLARIAH/clariah-plus/blob/main/requirements/software-metadata-requirements.md for an in-depth explanation of any found problems.
Validation of INT Corpus Frontend 3.1.1 was successful (score=3/5), but there are some warnings which should be addressed:
1. Info: Software source code *SHOULD* link to a continuous integration service that builds the software and runs the software's tests (This is missing in the metadata)
2. Warning: Documentation *SHOULD* be expressed (This is missing in the metadata)
3. Info: Reference publications *SHOULD* be expressed, if any (This is missing in the metadata)
4. Info: The funder *SHOULD* be acknowledged (This is missing in the metadata)
5. Info: The technology readiness level *SHOULD* be expressed (This is missing in the metadata)
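The numbered findings in the validation report above can be pulled apart programmatically, for example to count warnings separately from informational notes. The following is a minimal Python sketch, assuming every finding follows the `N. Severity: text` pattern seen in this report and that only the severities `Info` and `Warning` occur. The names `Finding` and `parse_findings` are illustrative and not part of codemetapy.

```python
import re
from typing import List, NamedTuple


class Finding(NamedTuple):
    number: int
    severity: str
    message: str


# A finding starts with "<n>. Info:" or "<n>. Warning:" and runs until the
# next such marker or the end of the text. This pattern is inferred from the
# report above, not taken from the validator's source code.
_FINDING = re.compile(
    r"(\d+)\.\s+(Info|Warning):\s+(.*?)(?=\s+\d+\.\s+(?:Info|Warning):|$)",
    re.S,
)


def parse_findings(report: str) -> List[Finding]:
    """Split a validation report into its numbered findings."""
    return [
        Finding(int(num), severity, message.strip())
        for num, severity, message in _FINDING.findall(report)
    ]


# Two of the five findings from the report above, joined on one line.
report = (
    "2. Warning: Documentation *SHOULD* be expressed "
    "(This is missing in the metadata) "
    "4. Info: The funder *SHOULD* be acknowledged "
    "(This is missing in the metadata)"
)

for finding in parse_findings(report):
    print(finding.number, finding.severity, finding.message)
```

Because the pattern treats any whitespace as a separator, it works the same whether the findings arrive on a single line or one per line.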
(log file starts at Sat Feb 7 03:06:36 UTC 2026)
[harvester info] --> Processing corpus-frontend (https://github.com/instituutnederlandsetaal/corpus-frontend) [Sat Feb 7 03:06:36 UTC 2026]
[harvester info] Git updating cached clone of https://github.com/instituutnederlandsetaal/corpus-frontend...
[harvester info] Found release v3.1.1
[harvester info] Using 'v3.1.1'
[harvester info] Git reference: v3.1.1
[harvester info] Scanning directory /tmp/codemeta-harvester.cache/corpus-frontend for harvestable resources...
[harvester info] found pom.xml (Java/Maven) for corpus-frontend, converting to codemeta
[harvester info] Looking for license....
[harvester info] No license file found
[harvester info] Getting contributors from git...
[harvester info] Getting top contributor from git...
[harvester info] Git top contributor Koen Mertens <koen.mertens@ivdnt.org> will be assigned as author (and maintainer) if none are found in the metadata
[harvester info] Extracting last and first commit date from git log....
[harvester info] Date created: 2014-03-19T11:00:15Z+0100, date modified: 2024-02-02T16:25:03Z+0300
[harvester info] Querying Github/GitLab API (https://github.com/instituutnederlandsetaal/corpus-frontend)
[harvester info] Adding URL for found README: README.md
[harvester info] Found releaseNotes
[harvester info] Querying Zenodo API for DOI (access token provided)...
[harvester info] Looking for TRL information in README.md...
[harvester info] Looking for repostatus information in README.md...
[harvester info] Looking for continuous integration information in README.md...
[harvester info] Looking for documentation links in README.md...
[harvester info] Falling back to git tag (v3.1.1) if no version number is specified...
[harvester info] Inferring repostatus information from git activity (used only as a fallback if not explicitly provided)...
[harvester info] Inferred repostatus https://www.repostatus.org/#active
[harvester info] Looking for repostatus information in README.md in master branch...
[harvester info] Setting group Blacklab & Corpus Search
[harvester info] Reconciliating: codemetapy --baseuri https://tools.clariah.nl --baseuri https://tools.clariah.nl --includecontext --addcontext https://w3id.org/nwo-research-fields --addcontext https://w3id.org/research-technology-readiness-levels --addcontextgraph https://vocabs.dariah.eu/rest/v1/tadirah/data?format=text/turtle --trl --identifier "corpus-frontend" --codeRepository "https://github.com/instituutnederlandsetaal/corpus-frontend" --validate /etc/software.ttl --released --enrich --textv "Please consult the CLARIAH Software Metadata Requirements at https://github.com/CLARIAH/clariah-plus/blob/main/requirements/software-metadata-requirements.md for an in-depth explanation of any found problems" -O /tmp/out/corpus-frontend.codemeta.json /tmp/codemeta-harvester.cache//tmp/99-version.corpus-frontend.codemeta.json /tmp/codemeta-harvester.cache//tmp/99-repostatus.corpus-frontend.codemeta.json /tmp/codemeta-harvester.cache//tmp/90-authors.corpus-frontend.codemeta.json /tmp/codemeta-harvester.cache//tmp/43-releasenotes.corpus-frontend.codemeta.json /tmp/codemeta-harvester.cache//tmp/41-readme.corpus-frontend.codemeta.json /tmp/codemeta-harvester.cache//tmp/40-gitapi.corpus-frontend.codemeta.json /tmp/codemeta-harvester.cache//tmp/39-gitdate.corpus-frontend.codemeta.json /tmp/codemeta-harvester.cache//tmp/32-contributors.corpus-frontend.codemeta.json /tmp/codemeta-harvester.cache//tmp/21-java.corpus-frontend.codemeta.json /tmp/codemeta-harvester.cache//tmp/04-applicationSuite.corpus-frontend.codemeta.json
-- begin log --
/usr/lib/python3.12/site-packages/pyshacl/extras/__init__.py:6: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
Passed 10 files/sources but specified 0 input types! Automatically guessing types...
Detected input types: [('/tmp/codemeta-harvester.cache//tmp/99-version.corpus-frontend.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/99-repostatus.corpus-frontend.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/90-authors.corpus-frontend.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/43-releasenotes.corpus-frontend.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/41-readme.corpus-frontend.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/40-gitapi.corpus-frontend.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/39-gitdate.corpus-frontend.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/32-contributors.corpus-frontend.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/21-java.corpus-frontend.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/04-applicationSuite.corpus-frontend.codemeta.json', 'json')]
Adding to contextgraph: /tmp/turtle
Initial URI automatically generated, may be overriden later: https://tools.clariah.nl/corpus-frontend
Processing source #1 of 10
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/99-version.corpus-frontend.codemeta.json
NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...
Injected (possibly temporary) URI https://tools.clariah.nl/corpus-frontend
[CODEMETA COMPOSITION (https://tools.clariah.nl/corpus-frontend)] processed 1 new triples, total is now 2
Processing source #2 of 10
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/99-repostatus.corpus-frontend.codemeta.json
NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...
Injected (possibly temporary) URI https://tools.clariah.nl/corpus-frontend
[CODEMETA COMPOSITION (https://tools.clariah.nl/corpus-frontend)] processed 1 new triples, total is now 3
Processing source #3 of 10
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/90-authors.corpus-frontend.codemeta.json
Found main resource with URI https://tools.clariah.nl/corpus-frontend.topcontributor/snapshot
Injected (possibly temporary) URI https://tools.clariah.nl/corpus-frontend
[CODEMETA COMPOSITION (https://tools.clariah.nl/corpus-frontend)] processed 8 new triples, total is now 10
Processing source #4 of 10
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/43-releasenotes.corpus-frontend.codemeta.json
NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...
Injected (possibly temporary) URI https://tools.clariah.nl/corpus-frontend
[CODEMETA COMPOSITION (https://tools.clariah.nl/corpus-frontend)] processed 2 new triples, total is now 12
Processing source #5 of 10
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/41-readme.corpus-frontend.codemeta.json
NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...
Injected (possibly temporary) URI https://tools.clariah.nl/corpus-frontend
[CODEMETA COMPOSITION (https://tools.clariah.nl/corpus-frontend)] processed 1 new triples, total is now 13
Processing source #6 of 10
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/40-gitapi.corpus-frontend.codemeta.json
Found main resource with URI https://tools.clariah.nl/corpus-frontend/snapshot
Injected (possibly temporary) URI https://tools.clariah.nl/corpus-frontend
[CODEMETA COMPOSITION (https://tools.clariah.nl/corpus-frontend)] processed 15 new triples, total is now 27
Processing source #7 of 10
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/39-gitdate.corpus-frontend.codemeta.json
NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...
Injected (possibly temporary) URI https://tools.clariah.nl/corpus-frontend
[CODEMETA COMPOSITION (https://tools.clariah.nl/corpus-frontend)] overriding old http://schema.org/dateCreated (2014-07-11T08:18:55Z -> 2014-03-19T11:00:15Z+0100)
[CODEMETA COMPOSITION (https://tools.clariah.nl/corpus-frontend)] overriding old http://schema.org/dateModified (2026-02-03T08:43:30Z -> 2024-02-02T16:25:03Z+0300)
[CODEMETA COMPOSITION (https://tools.clariah.nl/corpus-frontend)] processed 2 new triples, total is now 27
Processing source #8 of 10
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/32-contributors.corpus-frontend.codemeta.json
Found main resource with URI https://tools.clariah.nl/corpus-frontend.contributors/snapshot
Injected (possibly temporary) URI https://tools.clariah.nl/corpus-frontend
[CODEMETA COMPOSITION (https://tools.clariah.nl/corpus-frontend)] processed 68 new triples, total is now 90
Processing source #9 of 10
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/21-java.corpus-frontend.codemeta.json
Found main resource with URI https://tools.clariah.nl/nl.inl.blacklab.corpus-frontend/3.1.1
Injected (possibly temporary) URI https://tools.clariah.nl/corpus-frontend
[CODEMETA COMPOSITION (nl.inl.blacklab.corpus-frontend)] overriding old http://schema.org/author (https://tools.clariah.nl/stub/H-537c1591b3a28ada -> https://tools.clariah.nl/stub/H-2f1c13f79233b71b)
[CODEMETA COMPOSITION (nl.inl.blacklab.corpus-frontend)] overriding old http://schema.org/codeRepository (https://github.com/instituutnederlandsetaal/blacklab-frontend -> https://github.com/inl/corpus-frontend)
[CODEMETA COMPOSITION (nl.inl.blacklab.corpus-frontend)] overriding old http://schema.org/description (BlackLab Frontend, a feature-rich corpus search interface for BlackLab. -> A web application to search corpora through the BlackLab Server web service.)
[CODEMETA COMPOSITION (nl.inl.blacklab.corpus-frontend)] overriding old https://codemeta.github.io/terms/issueTracker (https://github.com/instituutnederlandsetaal/blacklab-frontend/issues -> https://github.com/INL/corpus-frontend/issues)
[CODEMETA COMPOSITION (nl.inl.blacklab.corpus-frontend)] overriding old http://schema.org/name (blacklab-frontend -> INT Corpus Frontend)
[CODEMETA COMPOSITION (nl.inl.blacklab.corpus-frontend)] overriding old http://schema.org/version (v3.1.1 -> 3.1.1)
[CODEMETA COMPOSITION (nl.inl.blacklab.corpus-frontend)] processed 100 new triples, total is now 178
Processing source #10 of 10
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/04-applicationSuite.corpus-frontend.codemeta.json
NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...
Injected (possibly temporary) URI https://tools.clariah.nl/corpus-frontend
[CODEMETA COMPOSITION (nl.inl.blacklab.corpus-frontend)] processed 1 new triples, total is now 179
Remapping URI to (possibly) new identifier and version component: https://tools.clariah.nl/corpus-frontend -> https://tools.clariah.nl/corpus-frontend/3.1.1
[CODEMETA VALIDATION (corpus-frontend)] done
[CODEMETA ENRICHMENT (corpus-frontend)] Guessing interface type http://schema.org/WebApplication based on clues
[CODEMETA ENRICHMENT (corpus-frontend)] considering first author as maintainer
VALIDATION https://tools.clariah.nl/corpus-frontend/3.1.1 #1: Info: Software source code *SHOULD* link to a continuous integration service that builds the software and runs the software's tests (This is missing in the metadata)
VALIDATION https://tools.clariah.nl/corpus-frontend/3.1.1 #2: Warning: Documentation *SHOULD* be expressed (This is missing in the metadata)
VALIDATION https://tools.clariah.nl/corpus-frontend/3.1.1 #3: Info: Reference publications *SHOULD* be expressed, if any (This is missing in the metadata)
VALIDATION https://tools.clariah.nl/corpus-frontend/3.1.1 #4: Info: The funder *SHOULD* be acknowledged (This is missing in the metadata)
VALIDATION https://tools.clariah.nl/corpus-frontend/3.1.1 #5: Info: The technology readiness level *SHOULD* be expressed (This is missing in the metadata)
-- end log --
[harvester info] Output written to /tmp/out/corpus-frontend.codemeta.json
[harvester info] Harvesting remote service URL https://portal.clarin.ivdnt.org/autocorp/ for corpus-frontend: codemetapy --baseuri https://tools.clariah.nl --baseuri https://tools.clariah.nl --includecontext --addcontext https://w3id.org/nwo-research-fields --addcontext https://w3id.org/research-technology-readiness-levels --addcontextgraph https://vocabs.dariah.eu/rest/v1/tadirah/data?format=text/turtle --trl -O "/tmp/codemeta-harvester.cache//tmp/corpus-frontend.codemeta.json" "/tmp/out/corpus-frontend.codemeta.json" "https://portal.clarin.ivdnt.org/autocorp/"
-- begin log --
/usr/lib/python3.12/site-packages/pyshacl/extras/__init__.py:6: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
Passed 2 files/sources but specified 0 input types! Automatically guessing types...
Detected input types: [('/tmp/out/corpus-frontend.codemeta.json', 'json'), ('https://portal.clarin.ivdnt.org/autocorp/', 'web')]
Adding to contextgraph: /tmp/turtle
Initial URI automatically generated, may be overriden later: https://tools.clariah.nl/corpus-frontend
Processing source #1 of 2
Parsing json-ld file from /tmp/out/corpus-frontend.codemeta.json
Found main resource with URI https://tools.clariah.nl/corpus-frontend/3.1.1
Injected (possibly temporary) URI https://tools.clariah.nl/corpus-frontend
[CODEMETA COMPOSITION (corpus-frontend)] processed 199 new triples, total is now 199
Processing source #2 of 2
Fallback: Obtaining metadata from remote URL https://portal.clarin.ivdnt.org/autocorp/
Service replied with content-type text/html
Traceback (most recent call last):
File "/usr/bin/codemetapy", line 8, in <module>
sys.exit(main())
^^^^^^
File "/usr/lib/python3.12/site-packages/codemeta/codemeta.py", line 339, in main
g, res, args, contextgraph = build(**args.__dict__)
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/codemeta/codemeta.py", line 692, in build
for targetres in codemeta.parsers.web.parse_web(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/codemeta/parsers/web.py", line 132, in parse_web
raise MiddlewareObstructionException(
codemeta.parsers.web.MiddlewareObstructionException: Unable to extract metadata from https://portal.clarin.ivdnt.org/autocorp/ because it immediately redirects to an external (SSO) login page rather than a proper landing page
-- end log --
[harvester error] Failed to obtain or process metadata from remote service URL https://portal.clarin.ivdnt.org/autocorp/ for corpus-frontend
[harvester info] Harvesting remote service URL https://opensonar.ivdnt.org/ for corpus-frontend: codemetapy --baseuri https://tools.clariah.nl --baseuri https://tools.clariah.nl --includecontext --addcontext https://w3id.org/nwo-research-fields --addcontext https://w3id.org/research-technology-readiness-levels --addcontextgraph https://vocabs.dariah.eu/rest/v1/tadirah/data?format=text/turtle --trl -O "/tmp/codemeta-harvester.cache//tmp/corpus-frontend.codemeta.json" "/tmp/out/corpus-frontend.codemeta.json" "https://opensonar.ivdnt.org/"
[harvester info] Harvesting remote service URL https://brievenalsbuit.ivdnt.org for corpus-frontend: codemetapy --baseuri https://tools.clariah.nl --baseuri https://tools.clariah.nl --includecontext --addcontext https://w3id.org/nwo-research-fields --addcontext https://w3id.org/research-technology-readiness-levels --addcontextgraph https://vocabs.dariah.eu/rest/v1/tadirah/data?format=text/turtle --trl -O "/tmp/codemeta-harvester.cache//tmp/corpus-frontend.codemeta.json" "/tmp/out/corpus-frontend.codemeta.json" "https://brievenalsbuit.ivdnt.org"
[harvester info] Harvesting remote service URL https://chn.ivdnt.org/ for corpus-frontend: codemetapy --baseuri https://tools.clariah.nl --baseuri https://tools.clariah.nl --includecontext --addcontext https://w3id.org/nwo-research-fields --addcontext https://w3id.org/research-technology-readiness-levels --addcontextgraph https://vocabs.dariah.eu/rest/v1/tadirah/data?format=text/turtle --trl -O "/tmp/codemeta-harvester.cache//tmp/corpus-frontend.codemeta.json" "/tmp/out/corpus-frontend.codemeta.json" "https://chn.ivdnt.org/"
[harvester info] <-- Finished processing corpus-frontend (https://github.com/instituutnederlandsetaal/corpus-frontend) [Sat Feb 7 03:07:14 UTC 2026]