Domain-specific software
We are working on CLaaS (CLARIAH as a Service). It will provide Domain Services such as Data & Models (Audio, Video, Text, Images, Structured Data), Transformation (Workflow, Provenance, Curation and Evaluation) and Interaction (Workspace, Execution, CMS, UI & UX).
The domain-specific software we develop is specialised enough to be useful for specific groups of researchers and generic enough to support a viable number of users. Here are some examples.
OCR and HTR software
We develop and customise software to improve OCR output for historical newspapers. The typefaces in those newspapers may resemble calligraphy, the ink may have faded, the column layout may pose problems, and advertisements may be misidentified as articles because advertisements of that period carried no illustrations. Medieval manuscripts and early modern documents pose similar challenges. We adjust Handwritten Text Recognition software to recognise each character despite the individual flourishes every clerk adds.
Extracting and linking entities
We link entities end to end. First we extract entities using customised NLP tools such as Named Entity Recognition. Then, to link these named entities correctly, we develop tools for name disambiguation and word-sense disambiguation in close cooperation with our linguists at the Meertens Institute and digital humanities researchers at DHLab.
Geo-toolkit and fuzzy matching
We also develop a geo-toolkit for all disciplines of history at every geographic level. Last but not least, we apply fuzzy matching when linking our data. When we look for correspondences between segments of a text and entries in a database, fuzzy matching lets us accept matches that are less than 100% perfect.
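To illustrate the idea of fuzzy matching, here is a minimal Python sketch. It is not the software described above. It uses the standard library's difflib module, and the function names, the 0.8 threshold and the sample place names are assumptions chosen only for this example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(segment, entries, threshold=0.8):
    """Return (entry, score) for the closest entry, or None if no entry
    reaches the threshold. The threshold value is an illustrative assumption."""
    if not entries:
        return None
    scored = [(entry, similarity(segment, entry)) for entry in entries]
    entry, score = max(scored, key=lambda pair: pair[1])
    return (entry, score) if score >= threshold else None


# Hypothetical database entries and text segments, for illustration only.
places = ["Amsterdam", "Rotterdam", "Batavia"]

print(best_match("Amsterdamme", places))  # a spelling variant close to one entry
print(best_match("Nagasaki", places))     # nothing similar enough in the list
```

A lower threshold accepts more spelling variants but also more false matches, so in practice the value would need to be tuned against the data at hand.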