mbt

MBT is a memory-based tagger-generator and tagger in one. The tagger-generator part can generate a sequence tagger on the basis of a training set of tagged sequences; the tagger part can tag new sequences. MBT can, for instance, be used to generate part-of-speech taggers or chunkers for natural language processing. It has also been used for named-entity recognition, information extraction in domain-specific texts, and disfluency chunking in transcribed speech.
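The two roles described above, generating a tagger from tagged sequences and then tagging new sequences, can be illustrated with a toy memory-based tagger. This is only a conceptual sketch, not MBT's implementation or API: the function names (`features`, `train`, `classify`, `tag`), the three-feature context window, and the simple overlap metric are all assumptions chosen for illustration. MBT itself is written in C++ and depends on timbl, and its actual feature patterns and similarity metrics are not reproduced here.

```python
from collections import Counter

def features(words, prev_tag, i):
    """Hypothetical feature vector: previous tag, focus word, next word."""
    nxt = words[i + 1].lower() if i + 1 < len(words) else "<END>"
    return (prev_tag, words[i].lower(), nxt)

def train(tagged_sentences):
    """'Generating' a memory-based tagger: store every training instance."""
    memory = []
    for sent in tagged_sentences:
        words = [w for w, _ in sent]
        prev = "<START>"
        for i, (_, gold_tag) in enumerate(sent):
            memory.append((features(words, prev, i), gold_tag))
            prev = gold_tag
    return memory

def classify(memory, feats, k=1):
    """k-nearest-neighbour vote; similarity = number of matching features."""
    def overlap(a, b):
        return sum(x == y for x, y in zip(a, b))
    # Stable sort: among equally similar instances, earlier ones win.
    ranked = sorted(memory, key=lambda inst: -overlap(inst[0], feats))
    votes = Counter(t for _, t in ranked[:k])
    return votes.most_common(1)[0][0]

def tag(memory, words):
    """Tag left to right, feeding each predicted tag into the next decision."""
    tags, prev = [], "<START>"
    for i in range(len(words)):
        prev = classify(memory, features(words, prev, i))
        tags.append(prev)
    return list(zip(words, tags))

training = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]
memory = train(training)
print(tag(memory, ["the", "cat", "barks"]))
```

The point of the sketch is that training amounts to storing instances in memory, and tagging amounts to looking up the most similar stored instances for each position in a new sequence.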

Provided tools & services

libmbt

Memory-based Tagging Library with API for C++
Type
  • Software Library
Executable name
libmbt

mbt

Memory-based tagger, command-line tool
Type
  • Command-line Application
Executable name
mbt

References

Citation

Please use one of the above reference publications to cite the software. If you want to cite the software directly, you can use the following citation generated from the metadata:

mbt 3.10. Humanities Cluster.

Logs & Reviews

Name
Automatic software metadata validation report for mbt 3.10
Author
  • codemetapy validator using software.ttl
Date
2024-10-12 03:11:55
Review
Please consult the CLARIAH Software Metadata Requirements at https://github.com/CLARIAH/clariah-plus/blob/main/requirements/software-metadata-requirements.md for an in-depth explanation of any found problems

Validation of mbt 3.10 was successful (score=3/5), but there are some warnings which should be addressed:

1. Warning: Documentation *SHOULD* be expressed (This is missing in the metadata)
2. Info: Reference publications *SHOULD* be expressed, if any (The metadata does express this currently, but something is wrong in the way it is expressed. Is the type/class valid?)
3. Info: The funder *SHOULD* be acknowledged (This is missing in the metadata)
4. Info: The technology readiness level *SHOULD* be expressed (This is missing in the metadata)
Rating
★ ★ ★ ☆ ☆
(log file starts at Sat Oct 12 03:11:50 UTC 2024)

[harvester info] --> Processing mbt (https://github.com/LanguageMachines/mbt) [Sat Oct 12 03:11:50 UTC 2024]

[harvester info] Git updating cached clone of https://github.com/LanguageMachines/mbt...

[harvester info] Found release v3.10

[harvester info] Using 'v3.10'

[harvester info] Git reference: v3.10

[harvester info] Scanning directory /tmp/codemeta-harvester.cache/mbt for harvestable resources...

[harvester info] found codemeta.json for mbt (md5sum c2da9a92d2b5c64e48958056b0d7e1fc); **NOTE: this is considered authoritative and most other detection methods will be skipped now!**

[harvester info] Inferring repostatus information from git activity (used only as a fallback if not explicitly provided)...

[harvester info] Inferred repostatus https://www.repostatus.org/#active

[harvester info] Looking for repostatus information in README.md in master branch...

[harvester info] Looking for repostatus information in README in master branch...

[harvester info] Parsing MAINTAINERS from master branch...

[harvester info] Reconciliating: codemetapy  --baseuri https://tools.clariah.nl --baseuri https://tools.clariah.nl --includecontext --addcontext https://w3id.org/nwo-research-fields --addcontext https://w3id.org/research-technology-readiness-levels --addcontextgraph https://vocabs.dariah.eu/rest/v1/tadirah/data?format=text/turtle --trl --identifier "mbt" --codeRepository "https://github.com/LanguageMachines/mbt" --validate /etc/software.ttl --released --enrich --textv "Please consult the CLARIAH Software Metadata Requirements at https://github.com/CLARIAH/clariah-plus/blob/main/requirements/software-metadata-requirements.md for an in-depth explanation of any found problems" -O /tmp/out/mbt.codemeta.json /tmp/codemeta-harvester.cache//tmp/99-repostatus.mbt.codemeta.json /tmp/codemeta-harvester.cache//tmp/10-jsonld.mbt.codemeta.json /tmp/codemeta-harvester.cache//tmp/05-maintainers.mbt.codemeta.json 

-- begin log --

Passed 3 files/sources but specified 0 input types! Automatically guessing types...

Detected input types: [('/tmp/codemeta-harvester.cache//tmp/99-repostatus.mbt.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/10-jsonld.mbt.codemeta.json', 'json'), ('/tmp/codemeta-harvester.cache//tmp/05-maintainers.mbt.codemeta.json', 'json')]

Adding to contextgraph: /tmp/turtle

Initial URI automatically generated, may be overriden later: https://tools.clariah.nl/mbt

Processing source #1 of 3

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/99-repostatus.mbt.codemeta.json

    NOTE: Not a valid JSON-LD document, @context missing! Attempting to inject automatically...

    Injected (possibly temporary) URI https://tools.clariah.nl/mbt

[CODEMETA COMPOSITION (https://tools.clariah.nl/mbt)] processed 1 new triples, total is now 2

Processing source #2 of 3

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/10-jsonld.mbt.codemeta.json

    Injected (possibly temporary) URI https://tools.clariah.nl/mbt

[CODEMETA CORRECTION (mbt)] automatically converting spdx license URI from https:// to http:///

[CODEMETA COMPOSITION (mbt)] processed 112 new triples, total is now 112

Processing source #3 of 3

Parsing json-ld file from /tmp/codemeta-harvester.cache//tmp/05-maintainers.mbt.codemeta.json

    Found main resource with URI https://tools.clariah.nl/maintainers/snapshot

    Injected (possibly temporary) URI https://tools.clariah.nl/mbt

[CODEMETA COMPOSITION (mbt)] processed 14 new triples, total is now 125

Remapping URI to (possibly) new identifier and version component: https://tools.clariah.nl/mbt -> https://tools.clariah.nl/mbt/3.10

[CODEMETA VALIDATION (mbt)] done

[CODEMETA ENRICHMENT (mbt)] adding author https://tools.clariah.nl/stub/H6d646ee3b512dba5 as contributor

[CODEMETA ENRICHMENT (mbt)] adding author https://orcid.org/0000-0003-2493-656X as contributor

[CODEMETA ENRICHMENT (mbt)] adding author https://tools.clariah.nl/stub/H-14eeb8d325fc8c96 as contributor

[CODEMETA ENRICHMENT (mbt)] adding author https://tools.clariah.nl/stub/H6e56810c72098616 as contributor

VALIDATION https://tools.clariah.nl/mbt/3.10 #1: Warning: Documentation *SHOULD* be expressed (This is missing in the metadata)

VALIDATION https://tools.clariah.nl/mbt/3.10 #2: Info: Reference publications *SHOULD* be expressed, if any (The metadata does express this currently, but something is wrong in the way it is expressed. Is the type/class valid?)

VALIDATION https://tools.clariah.nl/mbt/3.10 #3: Info: The funder *SHOULD* be acknowledged (This is missing in the metadata)

VALIDATION https://tools.clariah.nl/mbt/3.10 #4: Info: The technology readiness level *SHOULD* be expressed (This is missing in the metadata)

-- end log --

[harvester info] Output written to /tmp/out/mbt.codemeta.json

[harvester info] <-- Finished processing mbt (https://github.com/LanguageMachines/mbt) [Sat Oct 12 03:11:55 UTC 2024]

        

Metadata Properties

Version
3.10 (release notes)
Interface types
  • Command-line Application
  • Software Library
Software website
Source code repository
https://github.com/LanguageMachines/mbt
Keywords
  • machine learning
  • memory based learning
  • natural language processing
  • nlp
  • tagger
Development Status
  • Active: The project has reached a stable, usable state and is being actively developed.
Issue Tracker (Support)
https://github.com/LanguageMachines/mbt/issues
Documentation
License
Author(s)
Maintainer(s)
Contributor(s)
Producer
Programming Language
  • C++
Continuous Integration Tests
https://travis-ci.org/LanguageMachines/mbt
Operating System
  • BSD
  • Linux
  • macOS
Software dependencies
  • timbl
  • ticcutils
  • libxml2
Metadata validation
★ ★ ★ ☆ ☆