ArchAIDE - #Archaeorevolutionisnow
The ArchAIDE project (www.archaide.eu) was funded by the European Union’s Horizon 2020 research and innovation programme and has developed a new app that aims to improve the practice of pottery recognition in archaeology, using the latest automatic image recognition technology. Pottery is of fundamental importance for the comprehension and dating of archaeological contexts, and for understanding the dynamics of production, trade flows, and social interactions. Today, this characterisation and classification of ceramics are carried out manually. The goal of ArchAIDE is to optimise and economise this process, making knowledge accessible wherever archaeologists are working. ArchAIDE supports the classification and interpretation work of archaeologists with an innovative app for tablets and smartphones, designed to be an essential tool for archaeologists. Pottery fragments are photographed, their characteristics sent to a comparative collection (which is meant to show typical pottery types and characteristics, against which pottery to be identified by the user is compared), which activates the image recognition system, resulting in a response with all relevant information linked, and ultimately stored, within a database that allows sharing online. 
This goal has been implemented through the following practical elements: (a) a digital comparative collection for multiple pottery types has been created, incorporating existing digital collections, digitised paper catalogues and multiple photography campaigns; (b) a workflow that is as automated as possible has been built for accurately digitising paper catalogues and improving the search and retrieval process; (c) a multilingual thesaurus of descriptive pottery terms has been created, mapped to the Getty Art and Architecture Thesaurus and including French, German, Spanish, Catalan, English, Portuguese and Italian; (d) an app has been created using the digital comparative collections to support archaeologists in recognising potsherds during excavation and post-excavation analysis, with an easy-to-use interface and efficient image recognition algorithms for search and retrieval based on characteristics of either shape or decoration; (e) the app can also be used as a tool for learning about pottery identification, either for students or when specialists are not available; (f) once a sherd has been recognised, the app can be used to automatically populate information about the sherd into a virtual assemblage for a site, including the generation of an identity card; (g) the underlying technologies developed for the app have also been implemented as a desktop application, which is a web-based, real-time data visualisation resource, to improve access to archaeological heritage and generate new understanding; (h) the comparative data will be available from an open access archive, ensuring it is available for re-use beyond the ArchAIDE project, contributing to our sustainable, common heritage. The ArchAIDE partnership has representation from the academic and industry-led ICT domains, and the academic and development-led archaeology domains.
The archaeological partners of the consortium are the MAPPA Lab at the University of Pisa (coordinator), which has relevant experience in digital applications in archaeology and in archaeological communication; the Material Culture and Archaeometry research unit at the University of Barcelona, which is focused on promoting studies of material culture, especially archaeological ceramics, and archaeometric approaches; the Digital Archaeology Laboratory at the University of Cologne, which manages ARACHNE, a highly structured object database, in partnership with the German Archaeological Institute (DAI); and the Archaeology Data Service (ADS) at the University of York, which is a world-leading digital data archive for archaeology. The consortium also includes two companies carrying out preventive and development-led archaeological investigations: Baraka Arqueólogos S.L., which has particular expertise in the study of archaeological ceramics, and Elements S.L., which is experienced in the application of digital technologies related to ceramic studies. Finally, the consortium’s technical ICT partners are the Visual Computing Lab at CNR-ISTI, an institute of the Italian CNR devoted to research on Visual Media and Cultural Heritage; the Deep Learning Lab at the School of Computer Science at Tel Aviv University, which focuses on document analysis, image textual description, and action recognition; and the private software company Inera s.r.l., which has experience in the field of protocols and web apps. The core of the project lies in Artificial Intelligence. Two neural network models were developed, one for appearance-based (decoration) similarity and one for shape-based (profile) similarity. The system currently supports shape-based recognition of Terra Sigillata and Roman Amphorae, and decoration-based recognition of Majolica of Montelupo and Majolica from Barcelona and Valencia, as a proof-of-concept for the two main diagnostic criteria used by archaeologists.
In order to gather sufficient examples to train the algorithms, partners undertook photographic campaigns of collections in a variety of locations in Europe, along with help from volunteers and students at additional sites. This included more than 25,000 images collected by partners. For the appearance-based recognition (decoration), it was decided to create an algorithm that combines classic machine learning tools with neural networks trained on general image classification tasks. Following a testing phase on a huge dataset of images, the functionality was incorporated into the ArchAIDE app, and classification is now available to archaeologists. As for shape-based recognition, the system was designed to produce “synthetic sherds” (3D shapes available on the computer) to train the system, starting with the pottery profiles that are extracted from the catalogues, in combination with work carried out by partners at CNR-ISTI. Early in the project, CNR-ISTI carried out a variety of experiments around the automated generation of 3D models from traditional archaeological pottery profile drawings, which proved very successful. Not only was it possible to generate a 3D model, it was also possible to automatically identify the different diagnostic parts of a ceramic object, allowing objects to be ‘virtually broken’ to create a wide range of synthetic sherds. Partners at Tel Aviv University developed an algorithm based on a standard convolutional neural network (CNN), which was trained and tested on synthetic Amphorae sherds. As ArchAIDE trained on large numbers of classes, there was also experimentation with curriculum training (gradually introducing more classes during the training process) and custom loss functions, to make the network converge. Once the models were developed, we tested recognition performance on both desktop and mobile devices.
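The recognition tests are reported as top-1 and top-5 accuracy: a prediction counts as a top-5 hit if the correct class appears anywhere among the five answers returned, and as a top-1 hit only if it is the first answer. The following is a minimal sketch of that metric in Python, not the project's evaluation code; the function name `top_k_accuracy`, the class names and the three sample sherds are all hypothetical and serve only to illustrate the definition.

```python
from typing import Sequence


def top_k_accuracy(ranked_predictions: Sequence[Sequence[str]],
                   true_labels: Sequence[str],
                   k: int) -> float:
    """Fraction of sherds whose true class is among the first k ranked answers."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("need exactly one ranked list per true label")
    if not true_labels:
        raise ValueError("no samples to evaluate")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Hypothetical ranked outputs for three sherds (illustrative class names only).
ranked = [
    ["type_A", "type_B", "type_C", "type_D", "type_E"],  # correct answer first
    ["type_C", "type_A", "type_B", "type_E", "type_D"],  # correct answer second
    ["type_B", "type_D", "type_E", "type_C", "type_F"],  # correct answer missing
]
truth = ["type_A", "type_A", "type_A"]

top1 = top_k_accuracy(ranked, truth, k=1)  # one hit out of three sherds
top5 = top_k_accuracy(ranked, truth, k=5)  # two hits out of three sherds
```

By construction top-5 accuracy can never be lower than top-1 accuracy, since every top-1 hit is also a top-5 hit.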
The first step was to establish the average accuracy of the app in recognising the genre of the sherd’s decoration. We computed two different accuracies, called top-1 and top-5. For top-5 accuracy, a result is considered correct if the correct answer appears anywhere among the five answers output by the app. For top-1 accuracy, a result is correct only if the correct answer is the first one. The average top-5 accuracy is 83.8% and the top-1 accuracy is 55.2%. This accuracy, again, resulted from a process of improvement of both the neural network and the general workflow. Results suggest that higher scores are associated with better predictions. The tests conducted on shape-based recognition performance are analogous to those on appearance-based recognition. As expected, when the shape-based algorithm was first tested with real archaeological pottery sherds, the system did not operate well. Furthermore, as the network was trained on the sherds generated from profile drawings, the classification was not robust enough to handle the small variations that can be seen in sherds observed in practice. Both problems were addressed in the research carried out by partners at Tel Aviv University. The accuracy of the shape-based recognition is not as high as that of the appearance-based recognition. The average top-5 accuracy is 62.8%, and the top-1 accuracy is 36.3%. However, this is an excellent result given the technical difficulties involved in shape-based recognition, which made it necessary to develop a brand new approach. The full range of tools described in this paper is freely and openly available. The following features are currently functional within the mobile app (available on Google Play and AppStore): (a) the appearance-based recognition workflow (i.e. the automatic recognition of pottery through the decoration) is complete; (b) the shape-based recognition workflow (i.e.
the automatic recognition of pottery through the shape) is complete and allows for the correction of the white balance, the scaling of the image, and the extraction of the profile of a potsherd; (c) the information contained in the reference database can be searched and visualised; (d) a personal profile can be created within which archaeologists can store data related to assemblages of sherds from their own excavations. The following features are currently functional within the web-based desktop app (http://archaide-desktop.inera.it/): (a) the information contained in the reference database can be searched and visualised, in particular the geolocation of the origin and the occurrences of each pottery type, interrogatable 3D models of each pottery type, and the relationships between different sherds (for instance, types and stamps); (b) the appearance-based and shape-based recognition tools; (c) the data-visualisation tools; (d) a personal profile can be created within which archaeologists can store data related to assemblages of sherds from their own excavations. During the project, exploitation strategies were analysed and discussed, ranging from the commercialisation of the tools and the re-use of the technologies in different application domains to the commercialisation of the mobile app, among many others. The result has led the project to a promising approach: a free ArchAIDE mobile app as a vehicle to commercialise digitised versions of the pottery catalogues. The idea is to show copyright holders the added value of digitising paper catalogues, as they can then be used dynamically in a digital environment like the ArchAIDE app. Copyright holders can then be shown how their work becomes more accessible and more useful than in a traditional paper publication. While the long-term goal for archaeological data is to be open access, for those copyright holders who are not able to do so, a commercial exploitation model will be developed.
ArchAIDE app users will be able to buy a specific catalogue from the app itself as an “in-app purchase”, the proceeds of which are paid to the copyright holder for the use of their resource. For a user, buying a catalogue means being able to browse and search the types contained in it and display all the available information, including multimedia objects and, where available, 3D models generated by the ArchAIDE team. Whether or not any copyright holders choose to participate, this exploitation model will serve as a strong proof-of-concept for making paper catalogues more useful and accessible within a commercial environment, but copyright remains an open challenge. In fact, as the project has progressed, it has become evident that the comparative data necessary to implement the ArchAIDE app must be derived from a variety of sources, each with different advantages and restrictions. For example, one source might be the online comparative collection Roman Amphorae: A digital resource, held by the Archaeology Data Service, while an analogue equivalent might be a particular comparative paper catalogue for Majolica of Montelupo. In the first example, while the data creators retain copyright, the comparative collection is already freely and openly disseminated online via a deposit agreement between the copyright holder and the Archaeology Data Service, and can, therefore, be incorporated into the ArchAIDE app without needing to obtain further permissions from the copyright holders. This is not the case for the paper catalogue described in the second example, where conversion into a dynamic digital resource was never envisioned.
While CNR has developed useful tools to help digitise the paper catalogues needed to show the technical proof of concept of the ArchAIDE app, this does not mean the ArchAIDE project necessarily now holds copyright to the newly digitised, remixed data (although the metadata created as part of this process by the ArchAIDE project can be argued to be new data, for which the project can claim copyright). Showing the potential of digitising paper catalogues in a way that demonstrates how their content can be actively re-used allows ArchAIDE to now open discussions with publishers and other data providers, using a concrete example (seeing their data in use within the app), around the importance of making their resources available in new ways, furthering the long-term discourse around making research data open and accessible.
