Huma-Num provides tools and services to French communities of researchers and engineers in the social sciences and humanities (SSH) for each step in the research data lifecycle. It also provides research projects with a range of tools to facilitate the interoperability of various types of digital raw data and metadata. For digital collections more specifically, the aim is to foster the exchange and dissemination of metadata, but also of the data themselves, via standardized tools and lasting, open formats. The tools developed by Huma-Num are all based on Semantic Web technologies, chosen mainly for their self-describing features and for the enrichment opportunities they provide. Other interoperability technologies, such as OAI-PMH, complement those tools. All our resources are therefore fully compatible with Linked Open Data (LOD).
Three services in particular have been developed by Huma-Num to process, store and display research data while making them FAIR and preparing them for re-use and long-term preservation. These services cover the research data lifecycle and are designed to meet the needs arising from it:
- Share with NAKALA;
- Show and display with NAKALONA;
- Tag and push with ISIDORE.
These complementary services thus constitute a coherent chain of research data tools. While they interact smoothly with each other, they are also open to external tools using the same technologies.
NAKALA is an interoperable and secure service for depositing all types of data (text files, audio, video, images) in order to share them. Based on Semantic Web technologies, this repository mainly provides three types of services:
- assignment of a PID (Persistent IDentifier) making data and metadata citable;
- permanent data access;
- dissemination of metadata through a Triple Store and OAI-PMH. This allows the separation of data management from data presentation.
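The OAI-PMH dissemination mentioned above can be illustrated with a minimal sketch of how a harvester would build a `ListRecords` request. The verb and the `metadataPrefix` and `set` parameters come from the OAI-PMH protocol itself; the endpoint URL and the set name `my_collection` are placeholders assumed for illustration, not NAKALA's actual address or identifiers.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: a placeholder, not NAKALA's real OAI-PMH address.
BASE_URL = "https://repository.example.org/oai"


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Return the URL of an OAI-PMH ListRecords request.

    metadata_prefix selects the metadata format (oai_dc is the Dublin Core
    format every OAI-PMH repository must support); set_spec optionally
    restricts harvesting to one set.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# "my_collection" is an invented set name used only for this example.
url = build_list_records_url(BASE_URL, set_spec="my_collection")
print(url)
```

The sketch only constructs the request URL; a real harvester would then fetch it, parse the XML response and follow any `resumptionToken` to retrieve subsequent pages.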
NAKALONA is a software package which connects the content management system Omeka (created by the Roy Rosenzweig Center for History and New Media, George Mason University, Virginia, USA) and NAKALA, a service created by Huma-Num. It combines the power of Omeka for editing and displaying digital data with the features of NAKALA's repository for sharing data and metadata in an interoperable way. The main goal of NAKALONA is to make it possible to share and display the data and metadata already stored in NAKALA while taking advantage of Omeka's capabilities, such as its powerful search engine and extended OAI-PMH feeds. This software package is entirely managed and administered by the Huma-Num team, and provided as Software as a Service (SaaS).
ISIDORE is a search platform giving access to digital data in the Humanities and Social Sciences. Open to all, it relies on the principles of the Web of Data and provides free access to data (open access). More than a simple search interface, ISIDORE standardizes and enriches the metadata and data it collects using recognized vocabularies in three languages (French, English and Spanish).
One of the objectives is to prevent the loss of data by preparing their long-term preservation. Huma-Num highlights two aspects:
- Documenting the use of appropriate formats, which are the basis of data interoperability, greatly facilitates the archiving process.
- An important point is to make the storage of data independent of the device used to disseminate the data.
Different technologies are provided for cold data (i.e. inactive data that are rarely used or accessed), warm data (i.e. data that are analysed on a fairly frequent basis, but not constantly in play or in motion) and hot data (i.e. data used very frequently and data that administrators perceive to be constantly changing).
For cold data: backup on tapes
For cold data, the CC-IN2P3 Datacenter, where Huma-Num's infrastructure is hosted, provides a backup on tapes (currently around 700 TB).
For hot data: NAS service
For hot data, high availability is provided by a NAS combined with regular snapshots (currently around 100 TB).
For warm data: distributed Huma-Num Box system
Huma-Num Box is a distributed file system for warm data. A mesh of distributed storage nodes has been set up across France (currently 9 nodes), encapsulating different storage technologies. Backup and versioning can thus be performed on any node attached to this logically private network: the software allows complete flexibility in the type and frequency of backups and versioning (currently around 300 TB).
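As an illustration of the cold, warm and hot distinction described above, the following sketch maps an access frequency to a tier and to the storage technology named for that tier. The numeric thresholds are assumptions chosen only for the example; the text defines the tiers qualitatively and gives no figures for them.

```python
from enum import Enum


class Tier(Enum):
    COLD = "cold"
    WARM = "warm"
    HOT = "hot"


# Assumed thresholds, in accesses per month. Illustrative values only:
# the tiers are defined qualitatively, not by these numbers.
WARM_MIN_ACCESSES = 1
HOT_MIN_ACCESSES = 100

# Storage technology per tier, as described in the text above.
STORAGE_BY_TIER = {
    Tier.COLD: "backup on tapes",
    Tier.WARM: "Huma-Num Box distributed file system",
    Tier.HOT: "NAS with regular snapshots",
}


def classify(accesses_per_month: int) -> Tier:
    """Return the tier matching an access frequency (assumed thresholds)."""
    if accesses_per_month < 0:
        raise ValueError("accesses_per_month must be non-negative")
    if accesses_per_month >= HOT_MIN_ACCESSES:
        return Tier.HOT
    if accesses_per_month >= WARM_MIN_ACCESSES:
        return Tier.WARM
    return Tier.COLD


for accesses in (0, 12, 500):
    tier = classify(accesses)
    print(accesses, tier.value, STORAGE_BY_TIER[tier])
```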
ShareDocs is a file manager that can be used via a web browser, a WebDAV client or file synchronization software. Some of its features are comparable to those of tools such as Dropbox or Google Drive, but it has clear advantages concerning the security of data storage.
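Since ShareDocs can be reached with a WebDAV client, a folder listing would typically be requested with the WebDAV `PROPFIND` method. The sketch below builds such a request with Python's standard library without sending it. The folder URL is a placeholder assumed for the example, not ShareDocs' actual address, and authentication is left out.

```python
import urllib.request

# Hypothetical WebDAV folder URL: a placeholder, not ShareDocs' real address.
FOLDER_URL = "https://sharedocs.example.org/dav/my-project/"

# Standard WebDAV request body asking for three common properties.
PROPFIND_BODY = b"""<?xml version="1.0" encoding="utf-8"?>
<D:propfind xmlns:D="DAV:">
  <D:prop>
    <D:displayname/>
    <D:getcontentlength/>
    <D:getlastmodified/>
  </D:prop>
</D:propfind>"""


def build_propfind_request(folder_url: str) -> urllib.request.Request:
    """Build (but do not send) a PROPFIND request listing a folder.

    Depth: 1 asks for the folder itself and its immediate children.
    """
    return urllib.request.Request(
        folder_url,
        data=PROPFIND_BODY,
        method="PROPFIND",
        headers={
            "Depth": "1",
            "Content-Type": "application/xml; charset=utf-8",
        },
    )


request = build_propfind_request(FOLDER_URL)
print(request.get_method(), request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` would additionally require valid credentials; the server would answer with a `207 Multi-Status` XML document describing each entry.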
Huma-Num provides a long-term preservation service based on the CINES facility, which is intended for data of significant heritage or scientific value. This goes well beyond the bit-level preservation provided by the above-mentioned technologies. A long-term preservation project means that data have to be organized in such a way that they can be reused by someone who did not participate in their creation, which presupposes substantial curation. In addition, data should be expressed in a format accepted by CINES (see https://facile.cines.fr), and additional information must be provided to document the context of data production, the metadata, etc. Huma-Num supports this kind of project by acting as the go-between linking data producers, CINES, archivists and other actors.
Provision of Software
Huma-Num also buys licences and can provide access on demand to commercial software for text, image or sound processing, spatial data management and data analysis such as Oxygen, XMLmind XML Editor, Abbyy, R Studio, ArcGIS, etc. See the list of available software here.