I. INTRODUCTION
Cultural Heritage (CH) sites such as archaeological remains, natural monuments and museums have multiple dimensions of historical, cultural, natural and academic value. Since CH sites should be preserved and protected, many studies have adopted augmented reality (AR) to supplement CH sites in non-invasive ways. tom Dieck and Jung identified economic, experiential, social, epistemic, cultural, historical and educational values of AR at CH sites [1]. They considered AR a promising technology that can effectively preserve CH and enhance visitors' satisfaction while strengthening visitors' learning experience [1]. There are many examples of employing AR at CH sites [2][3][4][5][6][7]. However, even when AR content is used, visitors are often offered only limited on-site explanations drawn from pre-defined static information on various assets. In this paper, we propose a mobile AR mashup that enriches AR content with user-generated multimedia at CH sites such as natural monuments.
Our contributions in this paper are as follows.
- We propose an extended framework for mobile AR mashup using various recognition/tracking algorithms, mashup content management and a mashup UI.
- We develop an AR mashup content generation method for end-users.
- We demonstrate the feasibility of our approach by identifying a proper configuration for the mobile environment and deploying it at a real natural monument museum.
II. RELATED WORK
The preservation of CH sites including museums, galleries and archaeological remains is important for studying past cultures and civilizations. For this reason, AR technology has been applied to CH sites to preserve and protect them. tom Dieck and Jung used a stakeholder approach to explore the perceived value of AR implementation within the museum context [1]. They identified enhanced user satisfaction and user experience from using AR at CH sites [1]. Pedersen et al. developed the TombSeer software prototype to support embodied interaction for visitors of the Egyptian Tomb of Kitines replica exhibit at the Royal Ontario Museum [2]. This prototype employed a head-mounted display (HMD) to present 3D interactive holographic images [2]. Martínez et al. presented TinajAR, a multi-marker video-based AR edutainment application for showing virtual ceramic pieces and explaining the pottery process through virtual avatars [3]. Boboc et al. proposed a mobile AR application that contains historical information related to the Roman poet Ovid [4]. Guimarães et al. applied AR technology as a form of digital media art to the Calouste Gulbenkian Foundation Garden in Lisbon, Portugal [5]. Voinea et al. developed an AR application to visualize and explore a 3D model of a digital replica of a recognized UNESCO monument [6]. Oleksy and Wnuk developed an AR application to display historical photos in the former Jewish district of Warsaw, Poland [7]. They found that such an AR application helps facilitate positive attitudes towards a place and enhances multicultural place meaning [7].
Unlike traditional data mashups [8][9], mashup tools have adopted AR technologies to create and author user-generated/user-participated content in the real world. Shin et al. developed a general framework for mobile AR mashup that consists of object tracking, context management, content management, visualization and interaction components [10]. Shin et al. also developed an RGB-D SLAM based social spatial mashup to create a 3D feature map and link information to the 3D space [11]. Yoon and Woo introduced the concept of context-aware mobile augmented reality (CAMAR) mashup [12]. They later defined and solidified the concept of in-situ AR mashup as "seamlessly combining additional contextual information to a real-world object to enrich content in one or more senses, where the mashup process and its outcome are enhanced with context awareness and visualized with augmented reality for intuitive UI/UX" [13].
Langlotz et al. developed an in-situ authoring solution to create 3D content using 3D primitives and 2D annotations [14]. Seo et al. proposed a concept of "webizing" AR mashup to connect legacy things with existing Web services [15]. Meawad presented InterAKT as a solution to enhance AR browsing in the real world with crowdsourced geo-social content [16].
Previous AR systems can be categorized into fiducial marker-based and feature point-based (i.e., non-fiducial marker-based) systems. Feature point-based AR systems require pre-modelling of the real space for recognition and tracking. For a small space, feature point-based AR systems (e.g., ARCore, ARKit, Vuforia) can support various applications. However, their object recognition and tracking performance is not suitable for a guidance application covering a large indoor space. On the other hand, fiducial marker-based AR systems have matured over a long period of development and are fast and stable in terms of object recognition and tracking performance. However, these systems require specially designed frame-based markers, such as Vuforia's VuMark, which can be visually distracting and incompatible across systems. Compared to previous approaches, we use widely accepted QR codes to recognize and track objects of interest. Our approach scales to various objects in a large indoor space. Furthermore, we can easily identify objects of interest and author/mash up user-generated multimedia content with the associated QR code identifiers in a mobile computing environment (e.g., smartphones and tablets).
III. MOBILE AUGMENTED REALITY MASHUP SCENARIOS
We first introduce our mobile AR mashup scenarios for natural monuments. Our mobile AR mashup supports end-user mashup through Mashup Maker and Mashup Viewer scenarios, as shown in Figure 1. The first scenario is the Mashup Maker mode, which allows users to link Web content to objects in the real world. For this purpose, the Mashup Maker mode is composed of target object selection, multimedia content category selection, multimedia content adjustment/modification, and mashup confirmation steps. The output of the Mashup Maker mode is user-generated mobile AR content, which is stored in a cloud database. For example, consider a visitor at a natural monument such as an Asiatic black bear habitat. The visitor can select a QR code on a tree in the real world in the target object selection step. The visitor can then search for an appropriate multimedia content category in the multimedia content category selection step using various Web services. In the multimedia content adjustment/modification step, the visitor can select recently taken photos of Asiatic black bears to create new mobile AR content anchored to the QR code on the tree, which is stored in the cloud database.
The second scenario is the Mashup Viewer mode, where the saved mobile AR mashup content is retrieved for further guidance and sharing at CH sites. For example, other visitors can view the same QR code on the tree with their mobile devices. User-generated multimedia AR content such as photos, videos and handwritten notes from other visitors is then retrieved from the cloud database and presented as additional, enriched AR content to consume.
IV. FRAMEWORK FOR MOBILE AR MASHUP
A general framework for mobile AR mashup was previously designed to include object tracking, context management, visualization, interaction and content management components [10]. Similar concepts of mobile AR mashup [11][12][13] and in-situ authoring [14] have also been introduced and developed. Extending this general framework as shown in Figure 2, we integrated object tracking and recognition modules to cover various detection algorithms based on QR codes. Furthermore, we present implementation details on the representational elements of the created mobile AR mashup contents.
We define the elements that constitute mobile AR mashup contents in detail. The basic elements are the mashup target object (MTO), mashup target contents (MTC), and the resulting mashup AR contents (MARC). An MTO represents an object targeted for mashup that is physically located in the real world. In our scenarios, we use QR codes attached to real-world objects as mashup targets. A real-world object has multiple attributes including object name, keyword, corresponding QR code ID, location, and description. The MTC represent various types of multimedia information available in Web services and on local devices. Each MTC is represented with attributes including multimedia category, time, title, raw data and its data type. The MARC are a set of collocated information stored in a cloud database and linked to an MTO in the real world. Each resulting MARC includes a QR code ID referencing an MTO, a timestamp, MARC author information and the MTC to display when the QR code is recognized.
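To make these elements concrete, the following is a minimal sketch of the MTO, MTC and MARC elements as Python data classes. The field names follow the attributes listed above; the class names and types are illustrative, not the exact structures used in our implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

@dataclass
class MashupTargetObject:
    """MTO: a physical object identified by an attached QR code."""
    qr_code_id: str    # ID encoded in the attached QR code
    name: str          # object name
    keyword: str
    location: str
    description: str

@dataclass
class MashupTargetContent:
    """MTC: one piece of multimedia from a Web service or local device."""
    category: str      # multimedia category, e.g. photo, memo, tweet
    title: str
    time: datetime
    data_type: str     # type of the raw data, e.g. a MIME type
    raw_data: bytes

@dataclass
class MashupARContent:
    """MARC: stored mashup content linking MTC to an MTO."""
    qr_code_id: str    # references the MTO to display this content for
    author: str        # MARC author information
    timestamp: datetime
    contents: List[MashupTargetContent] = field(default_factory=list)
```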
The resulting mashup content is stored in a cloud database, where different pieces of information are distributed across visitor, exhibition, and mashup content tables with corresponding manager components, as shown in Figure 3. The stored information is divided into two parts: basic information tables and mashup content tables. The basic information tables include a visitor table that describes visitor profiles and an exhibition table that describes exhibition items in a museum. The mashup content tables include memo, SNS, photo and visit history tables. These mashup content tables are connected to the basic information tables and to the raw data.
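As a rough illustration of this organization, the snippet below sets up a comparable schema in Python, using SQLite as a local stand-in for the cloud SQL database. All table and column names are assumptions based on the description above, not our deployed schema; the SNS table follows the same pattern as the memo and photo tables and is omitted for brevity.

```python
import sqlite3

# SQLite stands in for the cloud SQL database in this sketch.
conn = sqlite3.connect("mashup.db")
conn.executescript("""
-- basic information tables
CREATE TABLE IF NOT EXISTS visitor (
  visitor_id INTEGER PRIMARY KEY,
  name TEXT, profile TEXT);
CREATE TABLE IF NOT EXISTS exhibition (
  qr_code_id TEXT PRIMARY KEY,
  name TEXT, keyword TEXT, location TEXT, description TEXT);
-- mashup content tables, each linked back to the basic tables
CREATE TABLE IF NOT EXISTS memo (
  memo_id INTEGER PRIMARY KEY,
  qr_code_id TEXT REFERENCES exhibition(qr_code_id),
  visitor_id INTEGER REFERENCES visitor(visitor_id),
  created_at TEXT, raw_data BLOB);
CREATE TABLE IF NOT EXISTS photo (
  photo_id INTEGER PRIMARY KEY,
  qr_code_id TEXT REFERENCES exhibition(qr_code_id),
  visitor_id INTEGER REFERENCES visitor(visitor_id),
  created_at TEXT, raw_data BLOB);
CREATE TABLE IF NOT EXISTS visit_history (
  visit_id INTEGER PRIMARY KEY,
  qr_code_id TEXT REFERENCES exhibition(qr_code_id),
  visitor_id INTEGER REFERENCES visitor(visitor_id),
  visited_at TEXT);
""")
conn.commit()
```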
Even though visitors as producers have explicitly chosen to participate in our AR application by authoring and sharing new AR contents, we share the concern about protecting visitors' personal information. To address this issue, the AR application collects visitors' information only through their accounts in the application. This information is never shared. When a visitor creates new mashup content, we provide options to publish the new AR content either publicly with creator information or anonymously. Only when the visitor chooses to reveal the creator's information is it made public, under the user's explicit consent. Since our application is only deployed at a site with a small number of visitors, we acknowledge that the topic of protecting visitors' information deserves further study.
Our mobile AR mashup tool mainly uses QR codes that contain identification information about objects in the real world. In our work, we used QR codes to extract identification and geolocation information. As shown in Figure 4, a QR code is a 2D image pattern containing data, version information, format information and position (finder) patterns. Based on this structure, it is possible to identify the QR code and estimate its 3D position.
A QR code is recognized and tracked following the flowchart shown in Figure 5. First, an input image from the camera is obtained and tested for QR code detection. The QR code recognition step starts when the QR code from the previous frame is not being tracked continuously. Once the position patterns in the QR code are detected, the orientation of the QR code is determined. Otherwise, optical flow tracking recovers the position of the marker based on the previous image and a homography. Finally, the 3D position of the QR code is estimated in the pose estimation step.
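The sketch below, in Python with OpenCV, illustrates this recognition-or-tracking loop. Our deployment uses a native Android/OpenCV implementation, so this is only an approximation of Figure 5: the camera intrinsics, QR code size and function names are assumed for illustration.

```python
import cv2
import numpy as np

# Assumed camera intrinsics and QR code size; real values come from calibration.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
DIST = np.zeros(5)   # assume negligible lens distortion
SIDE = 0.08          # assumed 8 cm QR code side length, in metres
# 3D corners of the QR code in its own coordinate frame
OBJ_PTS = np.array([[0, 0, 0], [SIDE, 0, 0],
                    [SIDE, SIDE, 0], [0, SIDE, 0]], dtype=np.float32)

detector = cv2.QRCodeDetector()

def process_frame(gray, prev_gray=None, prev_corners=None):
    """One pass of the recognition-or-tracking loop of Figure 5."""
    corners, qr_id = None, None
    if prev_corners is None:
        # Recognition step: find the position patterns and decode the ID.
        qr_id, pts, _ = detector.detectAndDecode(gray)
        if pts is not None:
            corners = pts.reshape(-1, 2).astype(np.float32)
    else:
        # Tracking step: follow the four corners with pyramidal optical flow
        # (the ID is already known from the previous recognition).
        pts, status, _ = cv2.calcOpticalFlowPyrLK(
            prev_gray, gray, prev_corners.reshape(-1, 1, 2), None)
        if status is not None and status.all():
            corners = pts.reshape(-1, 2)
    if corners is None:
        return None, None, None   # QR code neither recognized nor tracked
    # Pose estimation step: recover the 3D pose from the four corners.
    ok, rvec, tvec = cv2.solvePnP(OBJ_PTS, corners, K, DIST)
    pose = (rvec, tvec) if ok else None
    return corners, pose, qr_id
```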
After an object is recognized via its corresponding QR code, our proposed mashup tool allows users to view AR contents related to the object and also to modify them. For this purpose, 2D/3D rendering components are supported according to the user view for mobile mashup. The 2D renderer is used for presenting texts and images, while the 3D renderer is used for showing 3D contents. Furthermore, mashup-related event processing and UI presentation are included in our mashup tool. During mobile mashup, several events such as object detected, object lost, view mode changed and mashup category changed are registered with their related logic and GUI, as sketched below. Figure 6 shows our mobile AR mashup UI.
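A minimal sketch of such event registration is shown below; the event names come from the list above, while the class, handler and QR code ID names are illustrative rather than taken from our implementation.

```python
from collections import defaultdict

class MashupEventBus:
    """Registry for the mashup events listed above."""
    EVENTS = ("object_detected", "object_lost",
              "view_mode_changed", "mashup_category_changed")

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event, handler):
        """Register a handler (related logic or GUI update) for an event."""
        if event not in self.EVENTS:
            raise ValueError(f"unknown event: {event}")
        self._handlers[event].append(handler)

    def emit(self, event, **payload):
        """Dispatch an event to every registered handler."""
        for handler in self._handlers[event]:
            handler(**payload)

# Usage: update the AR overlay when a QR code appears or disappears.
bus = MashupEventBus()
bus.on("object_detected", lambda qr_id: print(f"show AR contents for {qr_id}"))
bus.on("object_lost", lambda: print("hide AR overlay"))
bus.emit("object_detected", qr_id="NM-0001")  # hypothetical QR code ID
```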
V. IMPLEMENTATION AND EVALUATION
We implemented the proposed mobile AR mashup on Android tablets with OpenCV for end-users. The mashup content server is composed of an Apache Web server, an SQL database and PHP scripts.
As shown in Figure 7, the Mashup Viewer mode provides guide information when a QR code is recognized from a camera image frame. The Mashup Viewer then provides additional information related to the detected object. The set of related information includes Twitter messages, memos, photos and visiting statistics. When a user taps an appropriate icon, the Mashup Viewer presents user-generated contents previously added by other users. Furthermore, the Mashup Maker mode allows the user to add more information about the object. When the user touches the button located at the bottom of the screen, the mashup UI for each mashup multimedia content is launched. The icons located on the right side represent the Memo, Twitter, Photo, Statistics Graph and Exit buttons, as shown in Figure 8. The user can add a memo on the screen by taking a note with a pen, or the user can search for and add pictures related to the exhibition object from Web services. Similarly, tweets can be selectively added by users from the Twitter service.
To show the feasibility of our mobile AR mashup using QR codes, we evaluated optical flow tracking combined with several well-known feature detection algorithms: Good Features to Track (GF), Speeded-Up Robust Features (SURF), Scale-Invariant Feature Transform (SIFT), Features from Accelerated Segment Test (FAST), and Oriented FAST and Rotated BRIEF (ORB). GF is fast and extracts features robust to image translation [17]. SURF is a speeded-up robust feature detection algorithm based on the Hessian matrix [18]. SIFT is a feature detection algorithm that is robust to image scale and rotation [19]. FAST is a fast corner detection algorithm based on a decision tree [20]. ORB improves on these by combining FAST detection with a rotated BRIEF descriptor [21]. We used optical flow to detect motion information between two images [22]. For comparison, we measured performance on a PC (Intel Core i7 CPU 3.40 GHz, 8 GB) and a mobile Android platform (Samsung Galaxy Tab 10.1, 1.4 GHz CPU).
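For reference, a desktop benchmarking sketch along these lines is shown below, using OpenCV's Python bindings. The detector parameters and the sample frame path are illustrative, and the timings it prints will not match Tables 1 and 2, which were measured with our native implementations.

```python
import time
import cv2

gray = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)  # a sample camera frame

sift = cv2.SIFT_create()
fast = cv2.FastFeatureDetector_create()
orb = cv2.ORB_create()

detectors = {
    "GF":   lambda g: cv2.goodFeaturesToTrack(g, 500, 0.01, 10),
    "SIFT": lambda g: sift.detect(g, None),
    "FAST": lambda g: fast.detect(g, None),
    "ORB":  lambda g: orb.detect(g, None),
    # SURF is patented and only available in opencv-contrib builds:
    # "SURF": lambda g: cv2.xfeatures2d.SURF_create().detect(g, None),
}

for name, detect in detectors.items():
    runs = 10
    start = time.perf_counter()
    for _ in range(runs):
        detect(gray)
    ms = (time.perf_counter() - start) / runs * 1000.0
    print(f"{name:>4}: {ms:7.2f} ms per frame")
```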
Table 1 and Table 2 show the recognition and tracking performance for a QR code on the PC and Android platforms. QR code recognition took 27 to 57 ms and was comparable on the two platforms. However, QR code tracking took much more time on the Android platform than on the PC. The slowest combination on the PC was Moving Average + Optical Flow (SIFT), which took 228 ms; the same combination took over 27 seconds on the Android platform. Consequently, on mobile platforms such as smartphones and tablets, we are restricted to a few selected algorithms. For example, on the Android platform, Moving Average took 69 ms, Moving Average + Optical Flow (GF) took 584 ms, and Moving Average + Optical Flow (FAST) took 1,571 ms. Our evaluation results suggest that these three algorithms are the most appropriate for mobile AR mashup tools.
We also evaluated the relationship between QR code size and recognizable distance. For this purpose, we prepared QR codes of different sizes ranging from 3 cm to 10 cm; note that each QR code has equal width and height. We placed the QR codes at distances ranging from 20 cm to 70 cm. As shown in Table 3, larger QR codes at shorter distances were recognized well, whereas smaller QR codes at further distances were not. We found that a QR code for mobile AR mashup should be big enough to be recognized in indoor/outdoor environments. Our recommendation is that QR codes of at least 8 cm work well at various distances from 20 cm to 70 cm.
Table 3. QR code recognition by code size and distance (O: recognized, X: not recognized)

Distance \ Size | 3 cm | 5 cm | 8 cm | 10 cm
---|---|---|---|---
20 cm | O | O | O | O
30 cm | O | O | O | O
50 cm | X | O | O | O
70 cm | X | X | O | O
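This trend is consistent with a simple pinhole-camera estimate, in which the projected size of the code falls off linearly with distance. The sketch below reproduces the O/X pattern of Table 3 under an assumed focal length and an assumed minimum projected size for reliable decoding; both constants are illustrative choices, not measured calibrations of our tablet.

```python
# Pinhole-camera estimate of the projected QR code size in pixels.
F_PX = 700.0    # assumed focal length in pixels (not a measured calibration)
MIN_PX = 55.0   # assumed minimum projected side length for reliable decoding

for size_cm in (3, 5, 8, 10):
    for dist_cm in (20, 30, 50, 70):
        proj_px = F_PX * size_cm / dist_cm   # projected side length in pixels
        mark = "O" if proj_px >= MIN_PX else "X"
        print(f"{size_cm:2d} cm code at {dist_cm:2d} cm -> {proj_px:5.0f} px [{mark}]")
```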
Based on these observations of the performance of the proposed mobile AR mashup, we deployed the mobile AR mashup service to a natural monument museum located in Daejeon, South Korea. At the museum, we observed that visitors were more interested in user-generated information than in the default contents linked to the exhibition objects. Since the proposed mashup tool allowed users to take memos, connect images related to an object or link tweets, visitors were actively engaged in their museum tours. From the museum manager's point of view, our mashup tool was convenient for attaching new information to exhibition objects. Usually, the default information on objects was fixed or infrequently updated. With end-user mobile AR mashup, additional user-generated information could easily be added by visitors. Furthermore, visitors' experiences at the museum were continuously accumulated and shared among visitors.
VI. CONCLUSION
In this paper, we introduced a mobile AR mashup for CH sites. To support user-generated content mashup based on mobile AR, we presented how mashup elements, QR code recognition/tracking, content management and the AR mashup UI are utilized. Based on these components, users can not only see the default information related to exhibited objects but also add and connect external information sources using Web services. Through our evaluation, we found that the proposed mashup is fast enough to support content mashup on the mobile Android platform. Through the deployment at and observation of the natural monument museum, we found that the proposed mashup gave visitors engaging opportunities to add extra multimedia information to the museum.
Despite these outcomes, several limitations should be addressed in further studies. First, we want to improve the overall speed and quality of mobile AR tracking and recognition. Second, a longitudinal study of visitors' behaviors and content updates should be considered. Nonetheless, our proposed mobile AR mashup provides end-users with the ability to create and author user-generated contents at CH sites. As mobile AR technology matures to cover digital twins in mixed reality [23] and Web AR [24], we believe our approach can bridge traditional AR toward user-participated and collaborative mobile AR.