The model of digitising archival lexicographic publications for the web
DOI:
https://doi.org/10.33604/sl.18.35.6
Keywords:
entry index, database, digitisation, digitisation model, web lexicography, text and image editing
Abstract
The paper presents a model designed to digitise old lexicographic publications from The Miroslav Krleža Institute of Lexicography for online publication (e-bastina.lzmk.hr). It includes twelve archival publications, the oldest of which are the first edition of the Maritime Encyclopedia (1954–1964), the Encyclopedia of the Institute of Lexicography (1955–1964), and the Medical Encyclopedia (1957–1965), while the most recent publication is the Encyclopedia of Croatian Art (1995–1996). These lexicographic publications have not been available in digital form until now, and their digitisation began with the scanning of printed books. Each lexicographic work varies in content, structure, and presentation of graphic supplements. Therefore, it was necessary to design a digitisation and online publication model that would work for different lexicographic publications. The presented model consists of six steps: 1) page scanning and optical character recognition, 2) text and image editing, 3) creation of an alphabetic index, 4) creation of a database, 5) development of a website for displaying previously structured data, and 6) online publication. Each of these steps incorporates multiple processes, which are determined by available technology, human knowledge, and resources. The paper explains, analyses, and illustrates each step with examples from the authors’ lexicographic practice, to ensure that the presented model can also be applied to the digitisation of other lexicographic publications. Another outcome of this paper is the presentation of the functional website called The Encyclopedic Heritage Collection (e-bastina.lzmk.hr), which was developed using this model. This site consists of twelve digitised lexicographic editions and contains 57 volumes available online with searchable entries. The total number of entries in all editions is 95,000. For each search result, the source is displayed, indicating the encyclopedia or lexicon in which the entry is found.
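Steps 4 and 5 of the model, together with the statement that each search result shows its source edition, can be illustrated with a minimal sketch. It is not the authors' implementation: the table layout, the function names `build_index` and `search`, and the sample rows are all assumptions made for illustration. Only the edition titles come from the abstract.

```python
import sqlite3

# Hypothetical placeholder rows: (headword, edition, volume, body).
# Edition titles are taken from the abstract; headwords, volume
# numbers and bodies are invented for illustration only.
SAMPLE_ROWS = [
    ("Anchor", "Maritime Encyclopedia", 1, "placeholder text"),
    ("Anatomy", "Medical Encyclopedia", 1, "placeholder text"),
    ("Altar", "Encyclopedia of Croatian Art", 1, "placeholder text"),
]


def build_index(rows):
    """Create an in-memory database with one row per entry (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries ("
        "headword TEXT NOT NULL, "
        "edition TEXT NOT NULL, "
        "volume INTEGER, "
        "body TEXT)"
    )
    conn.executemany("INSERT INTO entries VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def search(conn, term):
    """Return (headword, edition, volume) for headwords containing `term`.

    Keeping the edition in every result row is what lets a search
    across all editions display the source of each hit.
    """
    cur = conn.execute(
        "SELECT headword, edition, volume FROM entries "
        "WHERE headword LIKE ? ORDER BY headword, edition",
        (f"%{term}%",),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = build_index(SAMPLE_ROWS)
    for headword, edition, volume in search(conn, "an"):
        print(f"{headword} ({edition}, vol. {volume})")
```

A single table with an edition column is the simplest design that supports searching all editions at once. A production schema would likely separate editions, volumes, and images into their own tables.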
License
Copyright for papers published in this journal is retained by the authors, with first publication rights granted to the journal (this applies to both print and electronic issues). Papers in the journal are licensed under the Creative Commons Attribution licence (CC-BY), which permits users to copy and redistribute the material in any medium or format, as well as to remix, transform and build upon the material in educational and other settings, provided that credit is given to the author and that the original work is properly cited. The complete legal text of the licence is available at: https://creativecommons.org/licenses/by/4.0/legalcode. It is the author's responsibility to obtain permission to reproduce material from other sources. Authors also bear full responsibility in any case of copyright infringement.