The following closely related tools are in a tool suite together with python-ucto:
You can cite this software using the following citation generated from its metadata:
Please consult the CLARIAH Software Metadata Requirements at https://github.com/CLARIAH/clariah-plus/blob/main/requirements/software-metadata-requirements.md for an in-depth explanation of any found problems.

Validation of python-ucto 0.6.8 was successful (score=4/5), but there are some remarks which you may or may not want to address:

1. Info: Software source code *SHOULD* link to a continuous integration service that builds the software and runs the software's tests (This is missing in the metadata)
2. Info: An interface type *SHOULD* be expressed: Software source code should define one or more target products that are the resulting software applications offering specific interfaces (This is missing in the metadata)
3. Info: Reference publications *SHOULD* be expressed, if any (This is missing in the metadata)
4. Info: The funder *SHOULD* be acknowledged (This is missing in the metadata)
5. Info: A research domain *SHOULD* be expressed as a category using the NWO Research Fields vocabulary, if applicable (This is missing in the metadata)
6. Info: A research activity *SHOULD* be expressed as a category using the TaDiRaH vocabulary (This is missing in the metadata)
(log file starts at Sat Oct 12 03:16:06 UTC 2024)
[harvester info] --> Processing python-ucto (https://github.com/proycon/python-ucto) [Sat Oct 12 03:16:06 UTC 2024]
[harvester info] Git updating cached clone of https://github.com/proycon/python-ucto...
[harvester info] Found release v0.6.8
[harvester info] Using 'v0.6.8'
[harvester info] Git reference: v0.6.8
[harvester info] Scanning directory /tmp/codemeta-harvester.cache/python-ucto for harvestable resources...
[harvester info] found python setup for python-ucto, converting to codemeta
[harvester info] Looking for license....
[harvester info] No license file found
[harvester info] Getting contributors from git...
[harvester info] No git contributors found
[harvester info] Getting top contributor from git...
[harvester info] Git top contributor will be assigned as author (and maintainer) if none are found in the metadata
[harvester info] Extracting last and first commit date from git log....
[harvester info] Date created: 2014-05-21T19:33:30Z+0200, date modified: 2024-09-12T14:10:03Z+0200
[harvester info] Querying Github/GitLab API (https://github.com/proycon/python-ucto)
[harvester info] Adding URL for found README: README.rst
[harvester info] Found releaseNotes
[harvester info] Querying Zenodo API for DOI (access token provided)...
[harvester info] Found DOI https://doi.org/10.5281/zenodo.13754037
[harvester info] Converting README.rst to README.md
[harvester info] Looking for TRL information in README.md...
[harvester info] Looking for repostatus information in README.md...
[harvester info] Found repostatus https://www.repostatus.org/#active
[harvester info] Looking for continuous integration information in README.md...
[harvester info] Looking for documentation links in README.md...
[harvester info] Scraping title from https://folia.readthedocs.io/en/latest/
[harvester info] Found documentation at https://folia.readthedocs.io/en/latest/ : "name": "FoLiA: Format for Linguistic Annotation - Documentation and Reference Guide — FoLiA: Format for Linguistic Annotation v2.0 (rev 9.0) documentation",
[harvester info] Falling back to git tag (v0.6.8) if no version number is specified...
[harvester info] Inferring repostatus information from git activity (used only as a fallback if not explicitly provided)...
[harvester info] Inferred repostatus https://www.repostatus.org/#inactive
[harvester info] Looking for repostatus information in README.rst in master branch...
[harvester info] Found repostatus (master branch) https://www.repostatus.org/#active
[harvester info] Setting group Ucto
[harvester info] Reconciliating: codemetapy --baseuri https://tools.clariah.nl --baseuri https://tools.clariah.nl --includecontext --addcontext https://w3id.org/nwo-research-fields --addcontext https://w3id.org/research-technology-readiness-levels --addcontextgraph https://vocabs.dariah.eu/rest/v1/tadirah/data?format=text/turtle --trl --identifier "python-ucto" --codeRepository "https://github.com/proycon/python-ucto" --validate /etc/software.ttl --released --enrich --textv "Please consult the CLARIAH Software Metadata Requirements at https://github.com/CLARIAH/clariah-plus/blob/main/requirements/software-metadata-requirements.md for an in-depth explanation of any found problems" -O /tmp/out/python-ucto.codemeta.json /tmp/codemeta-harvester.cache//tmp/99-version.python-ucto.codemeta.json /tmp/codemeta-harvester.cache//tmp/99-repostatus.python-ucto.codemeta.json /tmp/codemeta-harvester.cache//tmp/90-authors.python-ucto.codemeta.json /tmp/codemeta-harvester.cache//tmp/50-documentation.python-ucto.codemeta.json /tmp/codemeta-harvester.cache//tmp/43-releasenotes.python-ucto.codemeta.json /tmp/codemeta-harvester.cache//tmp/41-readme.python-ucto.codemeta.json /tmp/codemeta-harvester.cache//tmp/40-gitapi.python-ucto.codemeta.json /tmp/codemeta-harvester.cache//tmp/39-gitdate.python-ucto.codemeta.json /tmp/codemeta-harvester.cache//tmp/20-python.python-ucto.codemeta.json /tmp/codemeta-harvester.cache//tmp/11-repostatus.python-ucto.codemeta.json /tmp/codemeta-harvester.cache//tmp/05-repostatus.python-ucto.codemeta.json /tmp/codemeta-harvester.cache//tmp/05-doi.python-ucto.codemeta.json /tmp/codemeta-harvester.cache//tmp/04-applicationSuite.python-ucto.codemeta.json
-- begin log --
Passed 13 files/sources but specified 0 input types! Automatically guessing types...
Detected input types: [('/tmp/codemeta-harvester.cache//tmp/99-version.python-ucto.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/99-repostatus.python-ucto.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/90-authors.python-ucto.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/50-documentation.python-ucto.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/43-releasenotes.python-ucto.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/41-readme.python-ucto.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/40-gitapi.python-ucto.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/39-gitdate.python-ucto.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/20-python.python-ucto.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/11-repostatus.python-ucto.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/05-repostatus.python-ucto.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/05-doi.python-ucto.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/04-applicationSuite.python-ucto.codemeta.json', 'json')]
Adding to contextgraph: /tmp/turtle
Initial URI automatically generated, may be overriden later: https://tools.clariah.nl/python-ucto
Processing source #1 of 13
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/99-version.python-ucto.codemeta.json
NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...
Injected (possibly temporary) URI https://tools.clariah.nl/python-ucto
[CODEMETA COMPOSITION (https://tools.clariah.nl/python-ucto)] processed 1 new triples, total is now 2
Processing source #2 of 13
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/99-repostatus.python-ucto.codemeta.json
NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...
Injected (possibly temporary) URI https://tools.clariah.nl/python-ucto
[CODEMETA COMPOSITION (https://tools.clariah.nl/python-ucto)] processed 1 new triples, total is now 3
Processing source #3 of 13
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/90-authors.python-ucto.codemeta.json
Found main resource with URI https://tools.clariah.nl/python-ucto.topcontributor/snapshot
Injected (possibly temporary) URI https://tools.clariah.nl/python-ucto
[CODEMETA COMPOSITION (https://tools.clariah.nl/python-ucto)] processed 1 new triples, total is now 3
Processing source #4 of 13
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/50-documentation.python-ucto.codemeta.json
NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...
Injected (possibly temporary) URI https://tools.clariah.nl/python-ucto
[CODEMETA COMPOSITION (https://tools.clariah.nl/python-ucto)] processed 4 new triples, total is now 7
Processing source #5 of 13
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/43-releasenotes.python-ucto.codemeta.json
NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...
Injected (possibly temporary) URI https://tools.clariah.nl/python-ucto
[CODEMETA COMPOSITION (https://tools.clariah.nl/python-ucto)] processed 2 new triples, total is now 9
Processing source #6 of 13
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/41-readme.python-ucto.codemeta.json
NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...
Injected (possibly temporary) URI https://tools.clariah.nl/python-ucto
[CODEMETA COMPOSITION (https://tools.clariah.nl/python-ucto)] processed 1 new triples, total is now 10
Processing source #7 of 13
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/40-gitapi.python-ucto.codemeta.json
Found main resource with URI https://tools.clariah.nl/python-ucto/snapshot
Injected (possibly temporary) URI https://tools.clariah.nl/python-ucto
[CODEMETA COMPOSITION (https://tools.clariah.nl/python-ucto)] processed 26 new triples, total is now 35
Processing source #8 of 13
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/39-gitdate.python-ucto.codemeta.json
NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...
Injected (possibly temporary) URI https://tools.clariah.nl/python-ucto
[CODEMETA COMPOSITION (https://tools.clariah.nl/python-ucto)] overriding old http://schema.org/dateCreated (2014-05-21T17:28:45Z -> 2014-05-21T19:33:30Z+0200)
[CODEMETA COMPOSITION (https://tools.clariah.nl/python-ucto)] overriding old http://schema.org/dateModified (2024-09-12T14:01:52Z -> 2024-09-12T14:10:03Z+0200)
[CODEMETA COMPOSITION (https://tools.clariah.nl/python-ucto)] processed 2 new triples, total is now 35
Processing source #9 of 13
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/20-python.python-ucto.codemeta.json
Found main resource with URI https://tools.clariah.nl/python-ucto/0.6.8
Injected (possibly temporary) URI https://tools.clariah.nl/python-ucto
[CODEMETA COMPOSITION (python-ucto)] overriding old http://schema.org/author (https://tools.clariah.nl/stub/H-f44c46603679bb6 -> https://tools.clariah.nl/stub/H-12c7010f3f2bdf25)
[CODEMETA COMPOSITION (python-ucto)] overriding old http://schema.org/description (This is a Python binding to the tokenizer Ucto. Tokenisation is one of the first step in almost any Natural Language Processing task, yet it is not always as trivial a task as it appears to be. This binding makes the power of the ucto tokeniser available to Python. Ucto itself is regular-expression based, extensible, and advanced tokeniser written in C++ (http://ilk.uvt.nl/ucto). -> This is a Python binding to the tokenizer Ucto. Tokenisation is one of the first step in almost any Natural Language Processing task, yet it is not always as trivial a task as it appears to be. This binding makes the power of the ucto tokeniser available to Python. Ucto itself is a regular-expression based, extensible, and advanced tokeniser written in C++ (https://languagemachines.github.io/ucto).)
[CODEMETA COMPOSITION (python-ucto)] overriding old https://codemeta.github.io/terms/developmentStatus (https://www.repostatus.org/#inactive -> https://www.repostatus.org/#active)
[CODEMETA COMPOSITION (python-ucto)] overriding old http://schema.org/keywords (text-processing -> tokenizer tokenization tokeniser tokenisation nlp computational_linguistics ucto)
[CODEMETA COMPOSITION (python-ucto)] overriding old http://schema.org/keywords (folia -> tokenizer tokenization tokeniser tokenisation nlp computational_linguistics ucto)
[CODEMETA COMPOSITION (python-ucto)] overriding old http://schema.org/keywords (nlp-library -> tokenizer tokenization tokeniser tokenisation nlp computational_linguistics ucto)
[CODEMETA COMPOSITION (python-ucto)] overriding old http://schema.org/keywords (tokenizer -> tokenizer tokenization tokeniser tokenisation nlp computational_linguistics ucto)
[CODEMETA COMPOSITION (python-ucto)] overriding old http://schema.org/keywords (computational-linguistics -> tokenizer tokenization tokeniser tokenisation nlp computational_linguistics ucto)
[CODEMETA COMPOSITION (python-ucto)] overriding old http://schema.org/keywords (python -> tokenizer tokenization tokeniser tokenisation nlp computational_linguistics ucto)
[CODEMETA COMPOSITION (python-ucto)] overriding old http://schema.org/keywords (nlp -> tokenizer tokenization tokeniser tokenisation nlp computational_linguistics ucto)
[CODEMETA COMPOSITION (python-ucto)] overriding old http://schema.org/version (v0.6.8 -> 0.6.8)
[CODEMETA COMPOSITION (python-ucto)] processed 50 new triples, total is now 67
Processing source #10 of 13
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/11-repostatus.python-ucto.codemeta.json
NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...
Injected (possibly temporary) URI https://tools.clariah.nl/python-ucto
[CODEMETA COMPOSITION (python-ucto)] processed 1 new triples, total is now 67
Processing source #11 of 13
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/05-repostatus.python-ucto.codemeta.json
NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...
Injected (possibly temporary) URI https://tools.clariah.nl/python-ucto
[CODEMETA COMPOSITION (python-ucto)] processed 1 new triples, total is now 67
Processing source #12 of 13
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/05-doi.python-ucto.codemeta.json
NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...
Injected (possibly temporary) URI https://tools.clariah.nl/python-ucto
[CODEMETA COMPOSITION (python-ucto)] overriding old http://schema.org/identifier (python-ucto -> )
[CODEMETA COMPOSITION (python-ucto)] processed 5 new triples, total is now 71
Processing source #13 of 13
Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/04-applicationSuite.python-ucto.codemeta.json
NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...
Injected (possibly temporary) URI https://tools.clariah.nl/python-ucto
[CODEMETA COMPOSITION (https://tools.clariah.nl/stub/H-285daf06b6ba062b)] processed 1 new triples, total is now 72
Remapping URI to (possibly) new identifier and version component: https://tools.clariah.nl/python-ucto -> https://tools.clariah.nl/python-ucto/0.6.8
[CODEMETA VALIDATION (python-ucto)] done
[CODEMETA ENRICHMENT (python-ucto)] automatically adding programmingLanguage Python derived from runtimePlatform Python
[CODEMETA ENRICHMENT (python-ucto)] adding author https://tools.clariah.nl/person/maarten-van-gompel as contributor
[CODEMETA ENRICHMENT (python-ucto)] adding affiliation(s) of first author as producer
VALIDATION https://tools.clariah.nl/python-ucto/0.6.8 #1: Info: Software source code *SHOULD* link to a continuous integration service that builds the software and runs the software's tests (This is missing in the metadata)
VALIDATION https://tools.clariah.nl/python-ucto/0.6.8 #2: Info: An interface type *SHOULD* be expressed: Software source code should define one or more target products that are the resulting software applications offering specific interfaces (This is missing in the metadata)
VALIDATION https://tools.clariah.nl/python-ucto/0.6.8 #3: Info: Reference publications *SHOULD* be expressed, if any (This is missing in the metadata)
VALIDATION https://tools.clariah.nl/python-ucto/0.6.8 #4: Info: The funder *SHOULD* be acknowledged (This is missing in the metadata)
VALIDATION https://tools.clariah.nl/python-ucto/0.6.8 #5: Info: A research domain *SHOULD* be expressed as a category using the NWO Research Fields vocabulary, if applicable (This is missing in the metadata)
VALIDATION https://tools.clariah.nl/python-ucto/0.6.8 #6: Info: A research activity *SHOULD* be expressed as a category using the TaDiRaH vocabulary (This is missing in the metadata)
-- end log --
[harvester info] Output written to /tmp/out/python-ucto.codemeta.json
[harvester info] <-- Finished processing python-ucto (https://github.com/proycon/python-ucto) [Sat Oct 12 03:16:19 UTC 2024]
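
The validation remarks in a log like this one can be pulled out with a short script. The following is a minimal sketch, not part of the harvester or codemetapy: it assumes only the entry shape visible in the log (`VALIDATION <uri> #<n>: <level>: <message>`), and the names `VALIDATION_RE` and `extract_remarks` are hypothetical. It is written to work whether the log entries are on separate lines or run together.

```python
import re

# Entry shape assumed from the log above (not from any codemetapy specification):
#   VALIDATION <uri> #<n>: <level>: <message>
# A message runs until the next VALIDATION entry, the "-- end log --" marker,
# or the end of the text, so flattened and line-per-entry logs both work.
VALIDATION_RE = re.compile(
    r"VALIDATION\s+(?P<uri>\S+)\s+#(?P<num>\d+):\s+(?P<level>\w+):\s+"
    r"(?P<message>.*?)(?=\s*(?:VALIDATION\s|-- end log --|$))",
    re.DOTALL,
)


def extract_remarks(log_text):
    """Return one dict per validation remark found in log_text."""
    remarks = []
    for match in VALIDATION_RE.finditer(log_text):
        remarks.append(
            {
                "uri": match["uri"],
                "number": int(match["num"]),
                "level": match["level"],
                # Collapse any internal line breaks or repeated spaces.
                "message": " ".join(match["message"].split()),
            }
        )
    return remarks


# Two entries copied from the log above, run together as in a flattened log.
sample = (
    "[CODEMETA VALIDATION (python-ucto)] done "
    "VALIDATION https://tools.clariah.nl/python-ucto/0.6.8 #3: Info: "
    "Reference publications *SHOULD* be expressed, if any "
    "(This is missing in the metadata) "
    "VALIDATION https://tools.clariah.nl/python-ucto/0.6.8 #4: Info: "
    "The funder *SHOULD* be acknowledged (This is missing in the metadata) "
    "-- end log --"
)

remarks = extract_remarks(sample)
for remark in remarks:
    print(remark["number"], remark["level"], remark["message"])
```

The `[CODEMETA VALIDATION (python-ucto)] done` entry is skipped because it lacks the `#<n>:` part that the pattern requires.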