GaLAHaD Train Battery

A Python program for training linguistic annotation taggers from a configuration file and a list of datasets. It prepares the resulting trained models for dockerization and adds the relevant metadata. It is agnostic to the tagger software used, as long as a simple Python shell (wrapper) is built around the tagger.
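As an illustration of the description above, the following is a minimal sketch of what a tagger-agnostic training loop could look like. It is not taken from the GaLAHaD Train Battery source: the `TaggerShell` interface, the `EchoTagger` stand-in, the `train_all` function, the config keys `datasets` and `options`, and the `metadata.json` file name are all hypothetical names chosen for this example.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Protocol


class TaggerShell(Protocol):
    """Hypothetical interface a tagger wrapper would expose to the trainer."""

    name: str

    def train(self, dataset: Path, options: dict, model_dir: Path) -> None:
        """Train on one dataset and write the model files into model_dir."""
        ...


@dataclass
class EchoTagger:
    """Stand-in tagger, used only to make this sketch runnable."""

    name: str = "echo"

    def train(self, dataset: Path, options: dict, model_dir: Path) -> None:
        # A real shell would call the underlying tagger software here.
        (model_dir / "model.txt").write_text(f"trained on {dataset.name}\n")


def train_all(tagger: TaggerShell, config: dict, out_root: Path) -> list[Path]:
    """Train one model per dataset in the config and write metadata next to it."""
    model_dirs = []
    for dataset in config["datasets"]:
        dataset_path = Path(dataset)
        model_dir = out_root / f"{tagger.name}-{dataset_path.stem}"
        model_dir.mkdir(parents=True, exist_ok=True)
        tagger.train(dataset_path, config.get("options", {}), model_dir)
        # Metadata that a later packaging (Docker) step could pick up.
        metadata = {"tagger": tagger.name, "dataset": dataset_path.name}
        (model_dir / "metadata.json").write_text(json.dumps(metadata, indent=2))
        model_dirs.append(model_dir)
    return model_dirs


if __name__ == "__main__":
    # Hypothetical configuration; the real config format may differ.
    config = {"datasets": ["corpus-a.tsv", "corpus-b.tsv"], "options": {"epochs": 1}}
    with tempfile.TemporaryDirectory() as tmp:
        for path in train_all(EchoTagger(), config, Path(tmp)):
            print(path.name, sorted(p.name for p in path.iterdir()))
```

Because the loop only depends on the `train` method and the `name` attribute, any tagger could in principle be plugged in by writing a small class with that shape, which is the sense in which such a design stays independent of the tagger software.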

Provided tools & services

GaLAHaD Train Battery - Dockerizer

Type
  • Unknown
Executable name
docker-build
Service Provider

GaLAHaD Train Battery - Trainer

Type
  • Unknown
Executable name
train
Service Provider

Tool suite: GaLAHaD

The following closely related tools are in a tool suite together with GaLAHaD Train Battery:

  • Server Application
  • Software Image
  • Web API
  • Web Application
  • 6 - Late prototype: Technology demonstrated in target setting, end-users adopt it for testing purposes.
  • Active: The project has reached a stable, usable state and is being actively developed.

GaLAHaD 1.2.2

GaLAHaD (Generating Linguistic Annotations for Historical Dutch) allows linguists to compare taggers, tag their own corpora, evaluate the results and export their tagged documents.
  • Analyzing
  • Annotating
  • Artificial intelligence, expert systems
  • Comparing
  • Computational linguistics and philology
  • Converting
  • Enriching
  • Lemmatizing
  • Linguistics
  • Machine Learning
  • Merging
  • POS-Tagging
  • Software for humanities
  • Tagging
  • Textual and linguistic corpora
  • Jvm
  • Linux
  • Node
Created: 2024-05-31
Modified: 2024-08-30
  • 8 - Complete: Technology complete and qualified, released for all end-users in scholarly environments.
  • Active: The project has reached a stable, usable state and is being actively developed.

int-pie 1.0.0

  •   Enrique Manjavacas
  •   Mike Kestemont
  •   Thibault Clerice
The PIE tagger with custom modifications by the Dutch Language Institute (INT).
  • Analyzing
  • Annotating
  • Artificial intelligence, expert systems
  • Computational linguistics and philology
  • Enriching
  • Lemmatizing
  • Linguistics
  • Machine Learning
  • POS-Tagging
  • Tagging
  • Linux
  • Python
Created: 2024-05-31
Modified: 2024-06-05

Citation

You can cite this software using the following citation generated from its metadata:

(2024) GaLAHaD Train Battery 1.0.0. Instituut voor de Nederlandse taal.

Logs & Reviews

Name
Automatic software metadata validation report for GaLAHaD Train Battery 1.0.0
Author
  • codemetapy validator using software.ttl
Date
2024-10-12 03:08:21
Review
Please consult the CLARIAH Software Metadata Requirements at https://github.com/CLARIAH/clariah-plus/blob/main/requirements/software-metadata-requirements.md for an in-depth explanation of any found problems

Validation of GaLAHaD Train Battery 1.0.0 was successful (score=3/5), but there are some warnings which should be addressed:

1. Info: Software source code *SHOULD* link to a continuous integration service that builds the software and runs the software's tests (This is missing in the metadata)
2. Info: An interface type *SHOULD* be expressed: Software source code should define one or more target products that are the resulting software applications offering specific interfaces (The metadata does express this currently, but something is wrong in the way it is expressed. Is the type/class valid?)
3. Info: An interface type *SHOULD* be expressed: Software source code should define one or more target products that are the resulting software applications offering specific interfaces (The metadata does express this currently, but something is wrong in the way it is expressed. Is the type/class valid?)
4. Warning: Documentation *SHOULD* be expressed (This is missing in the metadata)
5. Info: Reference publications *SHOULD* be expressed, if any (This is missing in the metadata)
6. Info: A research activity *SHOULD* be expressed as a category using the TaDiRaH vocabulary (This is missing in the metadata)
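The items reported as missing above could be checked locally before the next harvest. The sketch below is a simplified stand-in, not the codemetapy validator: it only tests whether a few properties are present in a codemeta-style dictionary. The mapping from each recommendation to a property name (`continuousIntegration`, `softwareHelp`, `referencePublication`, `applicationCategory`) and the substring test for TaDiRaH terms are assumptions; the CLARIAH Software Metadata Requirements remain the authoritative source.

```python
# Assumed mapping from each recommendation to a codemeta/schema.org property.
PRESENCE_CHECKS = {
    "continuousIntegration": "continuous integration link",
    "softwareHelp": "documentation",
    "referencePublication": "reference publication",
}


def missing_recommended(metadata: dict) -> list[str]:
    """Return labels for the recommended items that the metadata lacks."""
    missing = [label for key, label in PRESENCE_CHECKS.items() if not metadata.get(key)]
    categories = metadata.get("applicationCategory") or []
    if isinstance(categories, str):
        categories = [categories]
    # Assumption: a TaDiRaH category can be recognised by "tadirah" in its value.
    if not any("tadirah" in str(category).lower() for category in categories):
        missing.append("research activity (TaDiRaH category)")
    return missing


if __name__ == "__main__":
    # Illustrative metadata loosely based on the properties shown on this page.
    example = {
        "name": "GaLAHaD Train Battery",
        "codeRepository": "https://github.com/INL/galahad-train-battery",
        "applicationCategory": ["Linguistics"],
    }
    for label in missing_recommended(example):
        print("missing:", label)
```

The interface-type findings (items 2 and 3) are not covered here, since they concern how a value is typed rather than whether it is present.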
Rating
★ ★ ★ ☆ ☆
(log file starts at Sat Oct 12 03:08:19 UTC 2024)

[harvester info] --> Processing galahad-train-battery (https://github.com/INL/galahad-train-battery) [Sat Oct 12 03:08:19 UTC 2024]

[harvester info] Git updating cached clone of https://github.com/INL/galahad-train-battery...

[harvester info] Found release 1.0.0

[harvester info] Using '1.0.0'

[harvester info] Git reference: 1.0.0

[harvester info] Scanning directory /tmp/codemeta-harvester.cache/galahad-train-battery for harvestable resources...

[harvester info] found codemeta.json for galahad-train-battery (md5sum e9d63593c84e419b6d66b56b56631851); **NOTE: this is considered authoritative and most other detection methods will be skipped now!**

[harvester info] Inferring repostatus information from git activity (used only as a fallback if not explicitly provided)...

[harvester info] Inferred repostatus https://www.repostatus.org/#active

[harvester info] Setting group GaLAHaD

[harvester info] Reconciliating: codemetapy  --baseuri https://tools.clariah.nl --baseuri https://tools.clariah.nl --includecontext --addcontext https://w3id.org/nwo-research-fields --addcontext https://w3id.org/research-technology-readiness-levels --addcontextgraph https://vocabs.dariah.eu/rest/v1/tadirah/data?format=text/turtle --trl --identifier "galahad-train-battery" --codeRepository "https://github.com/INL/galahad-train-battery" --validate /etc/software.ttl --released --enrich --textv "Please consult the CLARIAH Software Metadata Requirements at https://github.com/CLARIAH/clariah-plus/blob/main/requirements/software-metadata-requirements.md for an in-depth explanation of any found problems" -O /tmp/out/galahad-train-battery.codemeta.json /tmp/codemeta-harvester.cache//tmp/99-repostatus.galahad-train-battery.codemeta.json /tmp/codemeta-harvester.cache//tmp/10-jsonld.galahad-train-battery.codemeta.json /tmp/codemeta-harvester.cache//tmp/04-applicationSuite.galahad-train-battery.codemeta.json 

-- begin log --

Passed 3 files/sources but specified 0 input types! Automatically guessing types...

Detected input types: [('/tmp/codemeta-harvester.cache//tmp/99-repostatus.galahad-train-battery.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/10-jsonld.galahad-train-battery.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/04-applicationSuite.galahad-train-battery.codemeta.json', 'json')]

Adding to contextgraph: /tmp/turtle

Initial URI automatically generated, may be overriden later: https://tools.clariah.nl/galahad-train-battery

Processing source #1 of 3

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/99-repostatus.galahad-train-battery.codemeta.json

    NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...

    Injected (possibly temporary) URI https://tools.clariah.nl/galahad-train-battery

[CODEMETA COMPOSITION (https://tools.clariah.nl/galahad-train-battery)] processed 1 new triples, total is now 2

Processing source #2 of 3

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/10-jsonld.galahad-train-battery.codemeta.json

    Injected (possibly temporary) URI https://tools.clariah.nl/galahad-train-battery

[CODEMETA COMPOSITION (galahad-train-battery)] processed 51 new triples, total is now 51

Processing source #3 of 3

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/04-applicationSuite.galahad-train-battery.codemeta.json

    NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...

    Injected (possibly temporary) URI https://tools.clariah.nl/galahad-train-battery

[CODEMETA COMPOSITION (galahad-train-battery)] processed 1 new triples, total is now 52

Remapping URI to (possibly) new identifier and version component: https://tools.clariah.nl/galahad-train-battery -> https://tools.clariah.nl/galahad-train-battery/1.0.0

[CODEMETA VALIDATION (galahad-train-battery)] done

[CODEMETA ENRICHMENT (galahad-train-battery)] adding author http://orcid.org/0009-0006-9941-9582 as contributor

VALIDATION https://tools.clariah.nl/galahad-train-battery/1.0.0 #1: Info: Software source code *SHOULD* link to a continuous integration service that builds the software and runs the software's tests (This is missing in the metadata)

VALIDATION https://tools.clariah.nl/galahad-train-battery/1.0.0 #2: Info: An interface type *SHOULD* be expressed: Software source code should define one or more target products that are the resulting software applications offering specific interfaces (The metadata does express this currently, but something is wrong in the way it is expressed. Is the type/class valid?)

VALIDATION https://tools.clariah.nl/galahad-train-battery/1.0.0 #3: Info: An interface type *SHOULD* be expressed: Software source code should define one or more target products that are the resulting software applications offering specific interfaces (The metadata does express this currently, but something is wrong in the way it is expressed. Is the type/class valid?)

VALIDATION https://tools.clariah.nl/galahad-train-battery/1.0.0 #4: Warning: Documentation *SHOULD* be expressed (This is missing in the metadata)

VALIDATION https://tools.clariah.nl/galahad-train-battery/1.0.0 #5: Info: Reference publications *SHOULD* be expressed, if any (This is missing in the metadata)

VALIDATION https://tools.clariah.nl/galahad-train-battery/1.0.0 #6: Info: A research activity *SHOULD* be expressed as a category using the TaDiRaH vocabulary (This is missing in the metadata)

-- end log --

[harvester info] Output written to /tmp/out/galahad-train-battery.codemeta.json

[harvester info] <-- Finished processing galahad-train-battery (https://github.com/INL/galahad-train-battery) [Sat Oct 12 03:08:21 UTC 2024]

        

Metadata Properties

Version
1.0.0 (release notes)
Interface types
  • Unknown
Source code repository
https://github.com/INL/galahad-train-battery
Category
  • Artificial intelligence, expert systems
  • Computational linguistics and philology
  • Linguistics
Development Status
  • 6 - Late prototype: Technology demonstrated in target setting, end-users adopt it for testing purposes.
  • Active: The project has reached a stable, usable state and is being actively developed.
Issue Tracker (Support)
https://github.com/INL/galahad-train-battery/issues
Documentation
License
Author(s)
Maintainer(s)
Contributor(s)
Producer
Programming Language
  • Python
Runtime Platform
  • Python 3.10.12
Operating System
  • Linux
Metadata validation
★ ★ ★ ☆ ☆
Created
2024-05-31
Last modified
2024-06-04