金海, 郝晓丽, 牛保宁. 结合否定关键词的空间关键词查询[J]. 微电子学与计算机, 2021, 38(9): 54-60.
引用本文: 金海, 郝晓丽, 牛保宁. 结合否定关键词的空间关键词查询[J]. 微电子学与计算机, 2021, 38(9): 54-60.
JIN Hai, HAO Xiaoli, NIU Baoning. Combination of negative keywords spatial keyword query[J]. Microelectronics & Computer, 2021, 38(9): 54-60.
Citation: JIN Hai, HAO Xiaoli, NIU Baoning. Combination of negative keywords spatial keyword query[J]. Microelectronics & Computer, 2021, 38(9): 54-60.

结合否定关键词的空间关键词查询

Combination of negative keywords spatial keyword query

  • 摘要: 面向个性化约束的空间关键词查询是数据库查询领域的热点问题,其中快速性和匹配性是衡量此类查询优劣的核心问题.传统空间关键词范围查询无法匹配除地理位置和关键词信息以外的带有个性化约束条件下的查询,且大多数二维空间下的索引结构的构建更新速度和查询效率较低.针对上述问题,提出了一种带否定关键词约束(即用户不喜欢的关键词)的查询模式.采用Geohash字符串表示兴趣点对象,对字符串排序后构建B+树作为二叉树的叶节点,通过二叉树过滤带否定关键词的对象,构建了基于Geohash的混合索引结构BGIB-Tree.在此基础上,依靠Geohash编码的递归性,设计了前缀匹配搜索算法.以区域编码和对象编码前缀匹配为剪枝策略,快速找到满足空间约束的兴趣点,最后在倒排索引中双向搜索即可完成查询.通过与IR-Tree和BIR-Tree对比,在真实数据集上对BGIB-Tree的构建时间与相关参数对查询算法的影响做出验证,实验证明结果表明在索引构建时间上减少30%,算法查询效率提高29%.

     

    Abstract: Spatial keyword query oriented to personalized constraints is a hot issue in the field of database query, in which the rapidity and matching are the core issues to evaluate the merits of such query. The traditional spatial keyword range query is unable to match the query with personalized constraints except geographical location and keyword information, and the construction and update speed and query efficiency of most index structures in two-dimensional space are low. Aiming at these problems, puts forward a constraint with negative keywords (i.e., users don't like keywords) query model, USES the Geohash string representation point of interest object, built after the string sorting B+Tree as a binary Tree leaf nodes, through the binary Tree to filter object with negative keywords, to build a hybrid index structure based on Geohash BGIB-Tree. On this basis, the prefix matching search algorithm is designed based on the recursion of Geohash encoding. The pruning strategy is based on the region encoding and the object encoding prefix matching to quickly find the points of interest that meet the spatial constraints. Finally, the query can be completed by two-way search in the inverted index. By comparing with IR-Tree and BIR-Tree, the influence of BGIB-tree's construction time and related parameters on the query algorithm was verified on the real data set. The experimen tresults proved show that the index construction time was reduced by 30% and the query efficiency of the algorithm was improved by 29%.

     

/

返回文章
返回