Abstract: Peer-to-peer (P2P) data-sharing systems now generate a significant portion of Internet traffic and have emerged as a popular way to share huge volumes of data. Requirements for widely distributed information systems supporting virtual organizations have given rise to a new category of P2P systems called schema-based. In such systems, each peer is a database management system in itself, exposing its own schema. In such settings, the main objective is to search efficiently across peer databases, processing each incoming query without consuming excessive bandwidth. The usability of these systems depends on effective techniques to find and retrieve data; however, efficient and effective routing of content-based queries is an emerging problem in P2P networks. In this paper, we propose an architecture based on (super-)peers, and we focus on query routing. Our approach groups (super-)peers with similar interests together so that queries can be routed efficiently. In such groups, called knowledge-super-peers (KSP), super-peers submit queries that are often processed by members of the same group. A KSP is a specific super-peer that holds knowledge about (1) its own super-peers and (2) the other super-peers. This knowledge is extracted with data mining techniques (e.g., decision tree algorithms) from the peer queries that transit the network. The advantage of this distributed knowledge is that it avoids performing a semantic mapping between the heterogeneous data sources owned by (super-)peers each time the system decides to route a query to other (super-)peers. The set of KSPs improves the robustness of the query routing mechanism and the scalability of the P2P network. Compared with a baseline approach, our proposal shows better performance with respect to important criteria such as response time, precision, and recall.
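To illustrate the routing idea described in the abstract, the following is a minimal sketch and not the implementation used in the paper. It assumes that queries can be reduced to a list of terms and that a KSP can observe which super-peer answered each past query. For brevity, it replaces the decision tree algorithms mentioned above with a simple term-frequency table. The names `KnowledgeSuperPeer`, `observe`, and `route` are hypothetical.

```python
from collections import Counter, defaultdict


class KnowledgeSuperPeer:
    """Hypothetical KSP that learns from past queries which super-peers
    tend to answer queries containing which terms.

    Assumption: a frequency table stands in for the decision tree
    algorithms mentioned in the abstract.
    """

    def __init__(self):
        # term -> Counter of super-peer ids that answered queries with that term
        self._answers = defaultdict(Counter)

    def observe(self, query_terms, answering_super_peer):
        """Record that a super-peer answered a query made of these terms."""
        for term in query_terms:
            self._answers[term][answering_super_peer] += 1

    def route(self, query_terms, top_k=1):
        """Return up to top_k super-peer ids, ranked by how often they
        answered past queries sharing terms with this one.

        An empty list means the KSP has no knowledge for this query.
        """
        scores = Counter()
        for term in query_terms:
            # .get avoids creating empty entries for unseen terms
            scores.update(self._answers.get(term, {}))
        # Sort by descending score, then by id for a deterministic order
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [super_peer for super_peer, _ in ranked[:top_k]]


ksp = KnowledgeSuperPeer()
ksp.observe(["patient", "diagnosis"], "SP1")
ksp.observe(["patient", "treatment"], "SP1")
ksp.observe(["invoice", "payment"], "SP2")

print(ksp.route(["patient", "invoice"], top_k=2))
```

In this sketch the routing decision relies only on previously observed query traffic, so no schema mapping is computed at routing time. When `route` returns an empty list, a caller would need some fallback, such as forwarding the query to all known super-peers.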
International Conference on Management of Emergent Digital EcoSystems, Oct 2009, Lyon, France. pp.91-98, 2009, 〈10.1145/1643823.1643841〉
