Search results: 594 results found for "Schema Matching"
  • [Journal] Argumentation-based schema matching for multiple digital libraries

    Journal: Online Information Review Authors: Quan, Tho Thanh ; Luong, Xuan H. ; Nguyen, Thanh C. ; Cheung, Hui Siu Keywords: Digital library ; Information retrieval ; Argumentation ; Schema matching Year: 2015
    Abstract: Purpose - Most digital libraries (DLs) are now available online. They also provide the Z39.50 standard protocol, which allows computer-based systems to effectively retrieve information stored in the DLs. The major difficulty lies in inconsistency between the database schemas of multiple DLs. The purpose of this paper is to present a system known as Argumentation-based Digital Library Search (ADLSearch), which facilitates information retrieval across multiple DLs. Design/methodology/approach - The proposed approach is based on argumentation theory for schema matching reconciliation from multiple schema matching algorithms. In addition, a distributed architecture is proposed for the ADLSearch system for information retrieval from multiple DLs. Findings - Initial performance results are promising. First, schema matching can improve the retrieval performance on DLs, as compared to the baseline technique. Subsequently, argumentation-based retrieval can yield better matching accuracy and retrieval efficiency than individual schema matching algorithms. Research limitations/implications - The work discussed in this paper has been implemented as a prototype supporting scholarly retrieval from about 800 DLs around the world. However, due to the complexity of the argumentation algorithm, the process of adding new DLs to the system cannot be performed in real time. Originality/value - In this paper, an argumentation-based approach is proposed for reconciling the conflicts from multiple schema matching algorithms in the context of information retrieval from multiple DLs. Moreover, the proposed approach can also be applied to similar applications which require automatic mapping from multiple database schemas.
  • [Journal] Variable linkage for multimedia metadata schema matching

    Abstract: Today there are many media sharing applications that use diverse metadata formats to describe media resources. This leads to interoperability issues in cataloguing, searching and annotation. This situation places schema matching algorithms in the eye of the storm of metadata interoperability. In this paper we present two different solutions for multimedia metadata schema matching using variable linkage algorithms. These methods consist in directly comparing the data values stored in the different metadata variables, which makes it possible to overcome the inherent limitations of schema-level matching approaches. We show the feasibility of these methods through some experiments with real metadata information extracted from the image hosting websites Deviantart, Flickr and Picasa.
  • [Journal] Effect of thesaurus size on schema matching quality

    Abstract: Thesauri are used in many Information Retrieval (IR) applications such as data integration, data warehousing, semantic query processing and schema matching. Schema matching or mapping is one of the most important basic steps in data integration. It is the process of identifying the semantic correspondence or equivalence between two or more schemas. Given that many thesauri exist for an identical knowledge domain, the quality and the change in the results of schema matching when using different thesauri in a specific knowledge field are not predictable. In this research, we studied the effect of thesaurus size on schema matching quality by conducting many experiments using different thesauri. In addition, a new method for calculating the similarity between vectors extracted from a thesaurus database is proposed. The method is based on the ratio of individual shared elements to the elements in the compound set of the vectors. Moreover, we explain in detail the efficient algorithm used in searching the thesaurus database. After describing the experiments, results that show an enhancement in the average similarity are presented. The completeness, effectiveness, and their harmonic mean measures were calculated to quantify the quality of matching. Experiments on two different thesauri show positive results, with an average Precision of 35% and a lower average Recall. The effect of thesaurus size on the quality of matching was statistically insignificant; however, other factors affecting the output and the exact value of change remain the focus of our future study. (C) 2014 Elsevier B.V. All rights reserved.
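    Example (illustrative sketch, not the authors' code): the abstract describes a similarity defined as the ratio of shared elements to the elements in the compound set of two vectors. Read as set intersection over set union, that ratio can be sketched as below. The function names, the sample terms, and the reading of "completeness" and "effectiveness" as recall and precision are all assumptions made for illustration.

```python
def shared_ratio(terms_a, terms_b):
    """Ratio of shared elements to elements in the combined (union) set.

    Assumption: the 'compound set' in the abstract is the set union.
    """
    set_a, set_b = set(terms_a), set(terms_b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty vectors: define the similarity as 0
    return len(set_a & set_b) / len(union)


def precision_recall_f(found, reference):
    """Precision, recall and their harmonic mean for a set of proposed matches."""
    found, reference = set(found), set(reference)
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical thesaurus term lists for two schema elements.
    print(shared_ratio(["price", "cost", "fee"], ["cost", "fee", "charge"]))
    # Hypothetical proposed matches versus a hand-made reference alignment.
    proposed = {("a", "x"), ("b", "y")}
    reference = {("a", "x"), ("c", "z"), ("d", "w")}
    print(precision_recall_f(proposed, reference))
```

    With the sample term lists, two of four distinct terms are shared, so the ratio is 0.5.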
  • [Journal] Schema matching based on position of attribute in query statement

    Journal: Knowledge-Based Systems Authors: Ding, Guohui ; Sun, Tianhe Keywords: Schema matching ; Database integration ; Query log ; Ant Colony Optimization ; Attribute position ; Query statement Year: 2015
    Abstract: Attribute-level schema matching is a critical step in numerous database applications, such as DataSpaces, Ontology Merging and Schema Integration. Much research exists on this topic; however, it all ignores evidence about the positions of attributes in query statements, which is crucial for finding high-quality matches between schema attributes. In this paper, we propose a novel matching technique based on the positions of attributes appearing in the schema structure of query results. The positions of attributes in query results reflect how important an attribute is to the user browsing the query results. The core idea of our approach is to collect statistics about attribute positions from query logs in order to find correspondences between attributes (matches). Our method works in three phases. The first phase is to design a matrix to record the statistics about attribute positions. Then, we employ two scoring functions to measure the similarities between the collected statistics of the two schemas to be matched. Finally, we employ a traditional algorithm to find the optimal mapping. Furthermore, our approach can be combined with other existing matchers to obtain more accurate matching results. An experimental study shows that our approach is effective and has good performance. (C) 2014 Elsevier B.V. All rights reserved.
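    Example (illustrative sketch, not the authors' code): the three phases named in the abstract (a position-statistics matrix, a similarity score between statistics, and a search for the best mapping) can be sketched as follows. The abstract does not specify the two scoring functions or the "traditional algorithm", so this sketch substitutes a single score based on the L1 distance between position distributions and an exhaustive search over one-to-one mappings. All names and the sample query logs are hypothetical.

```python
from itertools import permutations


def position_matrix(query_log, attributes, max_pos):
    """Phase 1: count how often each attribute appears at each result position."""
    counts = {attr: [0] * max_pos for attr in attributes}
    for result_attrs in query_log:
        for pos, attr in enumerate(result_attrs[:max_pos]):
            if attr in counts:
                counts[attr][pos] += 1
    return counts


def similarity(row_a, row_b):
    """Phase 2 (assumed score): 1 minus half the L1 distance between the
    normalized position distributions; 0.0 if either attribute was never seen."""
    total_a, total_b = sum(row_a), sum(row_b)
    if total_a == 0 or total_b == 0:
        return 0.0
    dist = sum(abs(a / total_a - b / total_b) for a, b in zip(row_a, row_b))
    return 1.0 - 0.5 * dist


def best_mapping(matrix_a, matrix_b):
    """Phase 3 (stand-in): exhaustive search for the one-to-one mapping with the
    highest total similarity. Only practical for small schemas; assumes schema A
    has no more attributes than schema B."""
    attrs_a, attrs_b = sorted(matrix_a), sorted(matrix_b)
    best, best_score = None, -1.0
    for perm in permutations(attrs_b, len(attrs_a)):
        score = sum(similarity(matrix_a[a], matrix_b[b]) for a, b in zip(attrs_a, perm))
        if score > best_score:
            best, best_score = dict(zip(attrs_a, perm)), score
    return best


if __name__ == "__main__":
    # Hypothetical logs: each entry is the ordered attribute list of one query result.
    log_a = [["title", "author", "year"], ["title", "year"], ["title", "author"]]
    log_b = [["name", "writer", "date"], ["name", "writer"], ["name", "date"]]
    m_a = position_matrix(log_a, ["title", "author", "year"], max_pos=3)
    m_b = position_matrix(log_b, ["name", "writer", "date"], max_pos=3)
    print(best_mapping(m_a, m_b))
```

    For larger schemas the exhaustive search would need to be replaced by a polynomial-time assignment algorithm; it is used here only to keep the sketch dependency-free.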
  • [Journal] A Novel Method for Instance Level Schema Matching

    Journal: Advanced Materials Research Authors: Han ; Yu Xiang Keywords: Data Integration ; Instance Level ; Schema Matching ; Similarity Matching ; Statistical Methods Year: 2013
    Abstract: Information integration refers to the problem of merging, coalescing and transforming autonomous heterogeneous data sources into a single global homogeneous database and providing a unified view of these data for future query processing purposes. One of the fundamental operations in the integration process is schema matching, which takes two schemas as input and produces a mapping between the attributes of the two schemas that correspond semantically to each other. Matching techniques can be grouped into two broad categories: schema-level matching and instance-level matching. In schema-level matching, we consider only the properties of schema elements, such as names, descriptions, data types, constraints and structures. For each match candidate pair of attributes, the degree of similarity is estimated by a normalized numeric value between 0 and 1. On the other hand, instance-level matching employs information available in the data contents of each table to determine the relationship between any two attributes. In this paper, we propose a statistical model to compare the likeness of two lists of values under two attributes from separate databases, in order to derive the similarity ratio of the two attributes. Our framework provides efficient procedures to compute the degree ratio using statistical coefficients for both categorical and numeric attributes.
  • [Journal] Reducing uncertainty of schema matching via crowdsourcing

    Journal: Proceedings of the VLDB Endowment Authors: Zhang, Chen Jason ; Chen, Lei ; Jagadish, H. V. ; Cao, Chen Caleb Year: 2013
    Abstract:
  • [Journal] OWL schema matching

    Journal: Journal of the Brazilian Computer Society Authors: Luiz André P. Paes Leme ; Marco A. Casanova ; Karin K. Breitman ; Antonio L. Furtado Keywords: Schema matching ; OWL ; Similarity ; Provenance Affiliation: 1. Department of Informatics, Pontifical Catholic University of Rio de Janeiro, Rua Marquês de S. Vicente, 225, Rio de Janeiro, RJ, CEP 22451-900, Brazil Year: 2010
    Abstract: Schema matching is a fundamental issue for many database applications, such as query mediation and data warehousing. It becomes a challenge when different vocabularies are used to refer to the same real-world concepts. In this context, a convenient approach, sometimes called extensional, instance-based, or semantic, is to detect how the same real-world objects are represented in different databases and to use the information thus obtained to match the schemas. Additionally, we argue that automatic approaches to schema matching should store provenance data about matchings. This paper describes an instance-based schema matching technique for an OWL dialect and proposes a data model for storing provenance data. The matching technique is based on similarity functions and is backed up by experimental results with real data downloaded from data sources found on the Web.
  • [Journal] Research of Information Integration Based on XML Schema Matching

    Journal: Applied Mechanics and Materials Authors: Gou, He Ping ; Jing, Yong Xia ; Zhu, Ya Ling Year: 2014
    Abstract:
  • [Journal] Enhanced geographically typed semantic schema matching

    Journal: Web Semantics: Science, Services and Agents on the World Wide Web Authors: Jeffrey Partyka ; Pallabi Parveen ; Latifur Khan ; B. Thuraisingham ; Shashi Shekhar Keywords: Schema; GIS; Gazetteer; Geocoding; Geotypes; Geosemantics Year: 2011
    Abstract: Resolving semantic heterogeneity across distinct data sources remains a highly relevant problem in the GIS domain requiring innovative solutions. Our approach, called GSim, semantically aligns tables from respective GIS databases by first choosing attributes for comparison. We then examine their instances and calculate a similarity value between them called entropy-based distribution (EBD) by combining two separate methods. Our primary method discerns the geographic types from instances of compared attributes. If successful, EBD is calculated using only this method. GSim further facilitates geographic type matching by using latlong values to disambiguate between multiple types of a given instance and by applying attribute weighting to quantify the uniqueness of mapped attributes. If geographic type matching is not possible, we then apply a generic schema matching method, independent of the knowledge domain, which employs normalized Google distance. We show the effectiveness of our approach over traditional approaches across multi-jurisdictional datasets.
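    Example (illustrative sketch, not the GSim implementation): normalized Google distance, which the abstract names as the basis of the generic fallback matcher, is defined from search hit counts as NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), where f(x) is the number of pages containing x, f(x, y) the number containing both terms, and N the total number of indexed pages. The function name and the hit counts below are hypothetical, and how GSim turns this distance into a match score is not stated in the abstract.

```python
import math


def normalized_google_distance(hits_x, hits_y, hits_xy, total_pages):
    """Normalized Google distance computed from page hit counts.

    hits_x, hits_y: pages containing each term alone
    hits_xy:        pages containing both terms
    total_pages:    size of the index (N)
    Returns infinity when the terms never co-occur or a term has no hits.
    """
    if hits_x <= 0 or hits_y <= 0 or hits_xy <= 0:
        return float("inf")
    log_x, log_y = math.log(hits_x), math.log(hits_y)
    numerator = max(log_x, log_y) - math.log(hits_xy)
    denominator = math.log(total_pages) - min(log_x, log_y)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts for two attribute names; no search API is called here.
    print(normalized_google_distance(1000, 100, 10, 1_000_000))
```

    A distance near 0 means the two terms almost always occur together; larger values indicate weaker association.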
  • [Journal] Efficient management of uncertainty in XML schema matching

    Abstract: Despite advances in machine learning technologies, a schema matching result between two database schemas (e.g., those derived from COMA++) is likely to be imprecise. In particular, numerous instances of "possible mappings" between the schemas may be derived from the matching result. In this paper, we study problems related to managing possible mappings between two heterogeneous XML schemas. First, we study how to efficiently generate possible mappings for a given schema matching task. While this problem can be solved by existing algorithms, we show how to improve the performance of the solution by using a divide-and-conquer approach. Second, storing and querying a large set of possible mappings can incur large storage and evaluation overhead. For XML schemas, we observe that their possible mappings often exhibit a high degree of overlap. We hence propose a novel data structure, called the block tree, to capture the commonalities among possible mappings. The block tree is useful for representing the possible mappings in a compact manner and can be efficiently generated. Moreover, it facilitates the evaluation of a probabilistic twig query (PTQ), which returns the non-zero probability that a fragment of an XML document matches a given query. For users who are interested only in answers with the k highest probabilities, we also propose the top-k PTQ and present an efficient solution for it. An extensive evaluation on real-world data sets shows that our approaches significantly improve the efficiency of generating, storing, and querying possible mappings.
  • [Journal] Multilingual schema matching for Wikipedia infoboxes

    Journal: Proceedings of the VLDB Endowment Authors: Nguyen, Thanh ; Moreira, Viviane ; Nguyen, Huong ; Nguyen, Hoa ; Freire, Juliana Year: 2011
    Abstract:
  • [Journal] A schema matching system for on-the-fly autonomous data integration

    Abstract: Conventional schema matching systems usually choose a randomly ordered set of simple matchers and subsequently combine their individual scores with the help of a composite function, ignoring the properties of the individual matchers and disregarding the context in which the matchers are applied. They therefore culminate in counterintuitive scores, improper matches, and wasteful and counterproductive computations. Moreover, as users can hardly validate the processing accuracy until the computation is complete, matching efficiency and quality assurance play a crucial role in autonomous systems. In this paper, we propose a new method, OntoMatch, for schema matching that can avoid wasteful computation by a prudent and objective selection of the ordering of a subset of useful matchers and consequently improves the matching efficiency and accuracy. Experimental results corroborate that successive applications of the simple matchers in OntoMatch monotonically improve the matching score.