Here you will find all tools (i.e. software and software services) developed in the CLARIAH project, as well as some tools from predecessor and sister projects. Our tools are designed for researchers and developers in the Humanities and Social Sciences. Not all tools are suitable for all audiences, and not all tools are mature and stable. This information should be clearly indicated for each tool, so you can make an informed judgement about whether a tool is suitable for you.
This list is automatically harvested from the tool producers and providers themselves, and updated daily.
Are you a CLARIAH developer whose tool is not yet included in the index, or do you have questions or comments about the metadata? Please read our contribution guidelines.
Alpino Webservice 2.4
- Rijksuniversiteit Groningen (backend), Radboud Universiteit Nijmegen (webservice)
- KNAW Humanities Cluster & CLST, Radboud University
Alpino is a dependency parser for Dutch, developed by Gertjan van Noord at the University of Groningen in the context of the PIONIER project Algorithms for Linguistic Processing. You can upload either tokenised or untokenised files (the latter are tokenised automatically using ucto). The output is a zip file containing XML files, one for each sentence in the input document.
- Internet > WWW/HTTP > WSGI > Application
- Text Processing > Linguistic
- dependency parsing
- folia
- linguistics
- nlp
- syntax
Created: 2015-09-08
Modified: 2023-11-01
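The output format described above (a zip archive with one XML file per sentence) could be read along the following lines. This is a minimal sketch, not the webservice's own client: the file names, the `read_sentences` and `make_demo_zip` helpers, and the assumption that each file keeps the sentence text in a `<sentence>` element under an `<alpino_ds>` root are illustrative and should be checked against real output.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def read_sentences(zip_bytes: bytes) -> dict[str, str]:
    """Map each XML file name in the archive to the text of its <sentence> element."""
    sentences = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if not name.endswith(".xml"):
                continue
            root = ET.fromstring(archive.read(name))
            # Assumption: the sentence text is a direct child element named <sentence>.
            sentences[name] = root.findtext("sentence", default="")
    return sentences


def make_demo_zip() -> bytes:
    """Build a tiny in-memory archive that stands in for real webservice output."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "1.xml",
            "<alpino_ds><node rel='top'/><sentence>De kat slaapt .</sentence></alpino_ds>",
        )
        archive.writestr(
            "2.xml",
            "<alpino_ds><node rel='top'/><sentence>Het regent .</sentence></alpino_ds>",
        )
    return buffer.getvalue()


if __name__ == "__main__":
    for name, text in read_sentences(make_demo_zip()).items():
        print(name, text)
```

For a real download, pass the bytes of the zip file to `read_sentences` instead of the demo archive.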
AlpinoGraph 1.0.5
AlpinoGraph is a tool for searching syntactically annotated corpora. It is built on AgensGraph, which combines database technology (PostgreSQL) with Cypher, the standard query language for graphs. The queries you can use in AlpinoGraph are therefore a mix of SQL and Cypher. AlpinoGraph adds a few extensions of its own, such as a simple but handy macro system and visualisation of the results.
- Linguistics
- nwo:ComputationalLinguisticsandPhilology
- Software for humanities
- Structural Analysis
- Alpino
- Cypher
- Dependency parsing
- SPOD: Syntactic profiler of Dutch
- UD: Universal Dependencies
Created: 2020-03-25
Modified: 2024-04-24
Automatic Speech Recognition Service 0.3
An Automatic Speech Recognition Service for a variety of languages, powered by WhisperX.
- Internet > WWW/HTTP > WSGI > Application
- Text Processing > Linguistic
- clam webservice rest nlp computational_linguistics rest
Created: 2024-02-16
Modified: 2024-04-12
Automatic Transcription of Dutch Speech Recordings 0.6.1
- Centre for Language and Speech Technology, Radboud University
This webservice uses automatic speech recognition to provide the transcriptions of recordings spoken in Dutch. You can upload and process only one file per project. For bulk processing and other questions, please contact Henk van den Heuvel at h.vandenheuvel@let.ru.nl.
- Software for humanities
- Speech Recognizing
- dutch
- nlp
- speech recognition
Created: 2017-04-02
FCS Aggregator 0.1
The Aggregator application is a part of the CLARIN-FCS common federated content search infrastructure. It serves as a user interface to perform queries to CLARIN-resources and display search results. The Aggregator communicates with components called endpoints, which are provided as a service by all centres who participate in the federated content search. Each endpoint provides access to one or more searchable resources. The user can select a specific resource or resources, based on the resource name or on the language, or search through all of them. The content of these resources is searched with the query supplied to the endpoint. The endpoint returns results to this query and the aggregator collects the responses from all the endpoints and displays them to the user.
- BlackLab
- CLARIN
- corpus search
- FCS 2.0
- Federated Content Search
- Nederlab
Created: 2016-09-11
Modified: 2023-05-10
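The fan-out pattern described above (one query sent to many endpoints, responses collected per endpoint) can be sketched as follows. This is not the Aggregator's implementation and does not speak the FCS protocol: here an endpoint is simply a hypothetical Python callable, and `aggregate` is an illustrative name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical stand-in: an endpoint is any callable that takes a query and returns hits.
Endpoint = Callable[[str], list[str]]


def aggregate(query: str, endpoints: dict[str, Endpoint]) -> dict[str, list[str]]:
    """Send the query to every endpoint in parallel and collect the hits per endpoint."""
    if not endpoints:
        return {}
    results: dict[str, list[str]] = {}
    with ThreadPoolExecutor(max_workers=len(endpoints)) as pool:
        futures = {name: pool.submit(search, query) for name, search in endpoints.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception:
                # An endpoint that raises contributes no hits instead of failing the search.
                results[name] = []
    return results


if __name__ == "__main__":
    def unavailable(query: str) -> list[str]:
        raise RuntimeError("endpoint unavailable")

    demo = {
        "corpus-a": lambda q: [f"hit for {q} in corpus A"],
        "corpus-b": unavailable,
    }
    print(aggregate("fiets", demo))
```

Catching the exception per endpoint is the design choice worth noting: one unavailable centre then yields an empty result list rather than an error for the whole search.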
Brieven als Buit search 3.1.1
Brieven als Buit, provided by the Dutch Language Institute in Leiden.
- corpus
Created: 2014-03-19
Modified: 2024-02-02
Corpus Hedendaags Nederlands 3.1.1
CHN, provided by the Dutch Language Institute in Leiden.
- corpus
Created: 2014-03-19
Modified: 2024-02-02
CLARIAH Tools 1.6.4
This is a web portal where you can find all tools (i.e. software and software services) developed in the CLARIAH project, as well as some tools from predecessor and sister projects. This list is automatically harvested from the tool producers and providers themselves, and updated daily. Our tools are designed for researchers and developers in the Humanities and Social Sciences. Not all tools are suitable for all audiences, and not all tools are mature and stable. This information should be clearly indicated for each tool, so you can make an informed judgement about whether a tool is suitable for you.
- Browsing
- Databases for humanities
- Discovering
- Exploration
- Gathering
- Software for humanities
- codemeta
- harvester
- linked data
- metadata
- rdf
- schema.org
- software metadata
Created: 2022-01-05
Modified: 2024-06-04
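The keywords above suggest that the harvested records use the CodeMeta and schema.org vocabularies. Under that assumption, the sketch below shows how a single software metadata record could be turned into a listing like the entries on this page. The record itself and the `summarise` helper are hypothetical; only the property names (`name`, `version`, `keywords`, `dateCreated`, `dateModified`) come from those vocabularies.

```python
import json

# Hypothetical record; the property names follow the CodeMeta / schema.org vocabulary.
RECORD = """
{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "name": "Example Tool",
  "version": "1.0.0",
  "dateCreated": "2022-01-05",
  "dateModified": "2024-06-04",
  "keywords": ["metadata", "harvester"]
}
"""


def summarise(record_json: str) -> str:
    """Render one metadata record as a title line, keyword bullets and date lines."""
    record = json.loads(record_json)
    lines = [f"{record['name']} {record.get('version', '')}".strip()]
    lines += [f"- {keyword}" for keyword in record.get("keywords", [])]
    # Dates are optional, which is why some entries show only a creation date.
    if "dateCreated" in record:
        lines.append(f"Created: {record['dateCreated']}")
    if "dateModified" in record:
        lines.append(f"Modified: {record['dateModified']}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(summarise(RECORD))
```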
Created: 2018
Created: 2019
Created: 2019
Created: 2017
Created: 2016
FLAT: the FoLiA Linguistic Annotation Tool 0.11.5
- KNAW Humanities Cluster & CLST, Radboud University
FLAT is a web-based linguistic annotation environment based around the FoLiA format (https://proycon.github.io/folia), a rich XML-based format for linguistic annotation. FLAT allows users to view annotated FoLiA documents and enrich these documents with new annotations. A wide variety of linguistic annotation types is supported through the FoLiA paradigm.
- Text Processing > Linguistic
- annotation
- computational linguistics
- folia
- linguistics
- nlp
Created: 2014-01-02
Modified: 2024-07-05
Piereling 0.4
- Centre for Language and Speech Technology, Radboud University
- KNAW Humanities Cluster & CLST, Radboud University
Piereling can convert a wide variety of document formats to FoLiA XML, and from FoLiA XML to various formats. Data conversions such as these provide the groundwork for Natural Language Processing pipelines. It relies on numerous specialised conversion tools in combination with notable third-party tools such as pandoc.
- Internet > WWW/HTTP > WSGI > Application
- Text Processing > Linguistic
- webservice nlp computational_linguistics rest folia conversion
Created: 2019-10-18
Modified: 2023-11-01
ForcedAlignment2 0.3.1
Forced alignment of text and audio files.
- alignment
- speech recognition
Created: 2020-03
Frog Webservice 2.7
- Centre for Language and Speech Technology, Radboud University and KNAW Humanities Cluster
Frog is a suite containing a tokeniser, Part-of-Speech tagger, lemmatiser, morphological analyser, shallow parser, and dependency parser for Dutch.
- Annotating
- Contextualizing
- Linguistics
- Named Entity Recognition
- POS-Tagging
- Segmenting
- Tagging
- Textual and content analysis
- Tree-Tagging
- clam webservice rest nlp computational_linguistics rest
Created: 2022-02-17
Modified: 2023-12-05
Grapheme to Phoneme converter 0.3.4
Grapheme to Phoneme (G2P) conversion. Input is a list of words (UTF-8, one word per line). For each word, the G2P outputs its best guess for the phonetic transcription. The system is trained on existing dictionaries. Please choose a language option. The system is a demo version; please contact CLST about using G2P for long word lists.
- Internet > WWW/HTTP > WSGI > Application
- Text Processing > Linguistic
- speech
- transcription
Created: 2019-02-25
Modified: 2023-05-12
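The input format described above (UTF-8, one word per line) can be produced with a few lines of Python. This is a small sketch for preparing such a file locally; `write_word_list` is an illustrative helper and is not part of the G2P service.

```python
from pathlib import Path
from typing import Iterable


def write_word_list(words: Iterable[str], path: Path) -> int:
    """Write the words to path as UTF-8, one word per line, and return the word count."""
    tokens = []
    for entry in words:
        # Splitting on whitespace guarantees that every line holds exactly one word.
        tokens.extend(entry.split())
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for token in tokens:
            handle.write(token + "\n")
    return len(tokens)


if __name__ == "__main__":
    count = write_word_list(["fiets", "één twee", ""], Path("words.txt"))
    print(f"wrote {count} words to words.txt")
```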
GaLAHaD 1.2.2
GaLAHaD (Generating Linguistic Annotations for Historical Dutch) allows linguists to compare taggers, tag their own corpora, evaluate the results and export their tagged documents.
- Analyzing
- Annotating
- Artificial intelligence, expert systems
- Comparing
- Computational linguistics and philology
- Converting
- Enriching
- Lemmatizing
- Linguistics
- Machine Learning
- Merging
- POS-Tagging
- Software for humanities
- Tagging
- Textual and linguistic corpora
Created: 2024-05-31
Modified: 2024-08-30
Glem 1.3.1
- Faculty of Philosophy, Theology and Religious Studies and Centre for Language and Speech Technology, Radboud University Nijmegen
GLEM is a lemmatizer for Ancient Greek.
- Annotating
- Computational linguistics and philology
- Greek and Latin philology and literature
- ancient greek
- greek
- lemma
- lemmatisation
- natural language processing
- nlp
Created: 2017-04-09
Modified: 2023-10-05
Created: 2016-03-07
Modified: 2022-09-16
I-Analyzer 5.3.0
I-analyzer is a tool for exploring corpora (large collections of texts). You can use I-analyzer to find relevant documents, or to make visualisations to understand broader trends in the corpus. The interface is designed to be accessible for users of all skill levels.
I-analyzer is primarily intended for academic research and higher education. We focus on data that is relevant for the humanities, but we are open to datasets that are relevant for other fields.
- corpus research
- data visualization
- elasticsearch
- natural language processing
- text-mining
Created: 2016-09-01
Modified: 2023-12-08
Ineo
Ineo lets you search, browse, find and select digital resources for your research in humanities and social sciences. The platform is already fully functional, but is still being filled with resource content. At the end of 2023, it will offer access to many tools, datasets, workflows, standards and educational material.
Created: 2019-01-16
Modified: 2024-10-08
Created: 2021-06-16
Modified: 2023-01-25
CLARIAH Media Suite 6.10
The CLARIAH Media Suite is a research environment in which researchers can search, bookmark, annotate and compare items from a number of cultural heritage collections.
- collection analysis
- cultural heritage
- data portal
- faceted search
- scholarly annotation
- virtual workspace
Created: 2023-11-21
Modified: 2023-11-21
Created: 2020-12-14
Network of Terms GraphQL API
GraphQL API for the Network of Terms, a search engine for finding terms in terminology sources (such as thesauri, classification systems and reference lists).
- Identifying
- graphql
- linked-data
- search
Created: 2020-04-17
Network of Terms Reconciliation API
Reconciliation API for the Network of Terms, a search engine for finding terms in terminology sources (such as thesauri, classification systems and reference lists).
- Identifying
- graphql
- linked-data
- search
Created: 2020-04-17
PaQu 1.0.5
With PaQu (Parse & Query) you can search syntactically annotated Dutch-language corpora.
PaQu supports two ways of searching. The first, simpler way lets you search for word pairs, optionally together with their syntactic relation. The second, more complex way uses the query language XPath.
A number of syntactically annotated corpora are available in PaQu by default, but you can also supply your own texts. These are analysed by the automatic parser and stored, after which you can search them in the same way.
- Linguistics
- nwo:ComputationalLinguisticsandPhilology
- Software for humanities
- Structural Analysis
- Alpino
- Dependency parsing
- SPOD: Syntactic profiler of Dutch
- UD: Universal Dependencies
- XPath
Created: 2014-05-21
Modified: 2024-04-24
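To give an impression of what an XPath search over a parsed sentence looks like, the sketch below runs two queries on a small hand-written tree. The sample XML only imitates Alpino-style output (nested `node` elements with `rel`, `cat`, `pos` and `word` attributes), which is an assumption here, and Python's `xml.etree.ElementTree` supports only a limited subset of XPath, so queries written for PaQu itself may need a fuller XPath engine. The `words_under` helper is illustrative.

```python
import xml.etree.ElementTree as ET

# Hand-written sample that imitates an Alpino-style dependency tree (assumption).
SAMPLE = """
<alpino_ds>
  <node cat="top" rel="top">
    <node cat="smain" rel="--">
      <node cat="np" rel="su">
        <node rel="det" pos="det" word="De"/>
        <node rel="hd" pos="noun" word="kat"/>
      </node>
      <node rel="hd" pos="verb" word="slaapt"/>
    </node>
  </node>
  <sentence>De kat slaapt</sentence>
</alpino_ds>
"""


def words_under(root: ET.Element, xpath: str) -> list[list[str]]:
    """For every node matched by xpath, list the words below it in document order."""
    matches = []
    for match in root.findall(xpath):
        # iter() visits the matched node itself and all of its descendants.
        matches.append([n.get("word") for n in match.iter("node") if n.get("word")])
    return matches


if __name__ == "__main__":
    tree = ET.fromstring(SAMPLE)
    # All subjects, then all verbal heads.
    print(words_under(tree, ".//node[@rel='su']"))
    print(words_under(tree, ".//node[@rel='hd'][@pos='verb']"))
```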
Created: 2021-11-18
Modified: 2024-06-18
SHEBANQ v4.2z
Search engine for biblical Hebrew based on the Biblia Hebraica Stuttgartensia (Amstelodamensis) database (formerly known as ETCBC, historically known as WIVU).
- Annotation
- BHS
- BHSA
- Bible
- Biblia Hebraica
- Biblia Hebraica Stuttgartensia
- Biblia Hebraica Stuttgartensia Amstelodamensis
- Data Science
- ETCBC
- Hebrew
- Hebrew Bible Reader
- Hebrew Bible Research
- Hebrew Bible Search
- Hebrew Online Bible
- Linguistic Queries
- Online Bible Hebrew
- Online Hebrew Bible
- Query
- Text Database
- WIVU
Created: 2017-10-19
Modified: 2022-10-12
T-scan 0.10.0
- Utrecht University
T-Scan is an analysis tool for Dutch text, mainly focusing on text complexity. It was initially conceptualized by Rogier Kraf and Henk Pander Maat. Rogier Kraf also programmed the first versions. From 2012 on, Henk Pander Maat supervised the development of the extended versions of the tool. These versions were programmed by Maarten van Gompel, Ko van der Sloot, Martijn van der Klis, Sheean Spoel and Luka van der Plas.
- dutch
- feature extraction
- natural language processing
- nlp
- readability
Created: 2012-09-12
Ucto Webservice 2.5.2
- Centre for Language and Speech Technology, Radboud University and KNAW Humanities Cluster
- KNAW Humanities Cluster & CLST, Radboud University
Ucto is a Unicode-compliant tokeniser. It takes one or more untokenised texts as input and tokenises them. Several languages are supported, and the software can be extended to other languages.
- Annotating
- Linguistics
- Tagging
- Textual and content analysis
- clam webservice rest nlp computational_linguistics rest
Created: 2022-04-08
Modified: 2024-03-14
Service to tokenize, lemmatize, pos-tag and dependency parse using udpipe 4.10
A REST service for an R/udpipe-based tokenizer, lemmatizer, POS tagger and dependency parser.
See https://bitbucket.org/fryske-akademy/udpipe for (Docker) setup.
Created: 2020-11-18
Modified: 2023-11-26