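Final result
The final result, as in the figure below, resembles Baidu search: suggestions are returned as the user types. Typing the pinyin of a term produces the same suggestions.
Using completion
Install the ik Chinese analyzer: https://github.com/medcl/elasticsearch-analysis-ik
Install the pinyin analyzer: https://github.com/medcl/elasticsearch-analysis-pinyin
Define the keyword index with a custom ik + pinyin analyzer. Autocomplete requires the special completion field type, and for both Chinese characters and pinyin to be completed, the field must use a custom analyzer that combines ik and pinyin.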
- PUT suggest
- {
- "settings": {
- "number_of_replicas": 0,
- "number_of_shards": 1,
- "analysis": {
- "analyzer": {
- "ik_pinyin_analyzer": {
- "type": "custom",
- "tokenizer": "ik_max_word",
- "filter": ["my_pinyin", "word_delimiter"]
- }
- },
- "filter": {
- "my_pinyin": {
- "type": "pinyin",
- "first_letter": "prefix",
- "padding_char": " "
- }
- }
- }
- },
- "mappings": {
- "suggest": {
- "properties": {
- "keyword": {
- "type": "completion",
- "analyzer": "ik_pinyin_analyzer",
- "fields": {
- "key": {
- "type": "keyword"
- }
- }
- },
- "id": {
- "type": "keyword"
- },
- "createDate": {
- "type": "date",
- "format": "yyyy-MM-dd HH:mm:ss"
- }
- }
- }
- }
- }
Index some initial data
- POST _bulk/?refresh=true
- { "index": { "_index": "suggest", "_type": "suggest" }}
- { "keyword": "项目"}
- { "index": { "_index": "suggest", "_type": "suggest" }}
- { "keyword": "项目进度"}
- { "index": { "_index": "suggest", "_type": "suggest" }}
- { "keyword": "项目管理"}
- { "index": { "_index": "suggest", "_type": "suggest" }}
- { "keyword": "项目进度及调整 汇总.doc_文档"}
- { "index": { "_index": "suggest", "_type": "suggest" }}
- { "keyword": "项目"}
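The mapping above declares a createDate field with the format yyyy-MM-dd HH:mm:ss, and the hot-search query later in this article filters on it, while the sample documents above carry only keyword. The following is a minimal sketch of building a bulk body in which every document also has a createDate. The class BulkBodyBuilder and its methods are hypothetical names introduced for illustration, and the JSON escaping is deliberately minimal (backslash and double quote only).

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.List;

/** Hypothetical helper: builds an NDJSON body for the _bulk API. */
final class BulkBodyBuilder {
    // Same pattern as the createDate mapping defined above.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Minimal JSON string escaping: backslash and double quote only. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** One action line plus one document line per keyword. */
    static String build(List<String> keywords, LocalDateTime createdAt) {
        String date = createdAt.format(FORMAT);
        StringBuilder body = new StringBuilder();
        for (String keyword : keywords) {
            body.append("{ \"index\": { \"_index\": \"suggest\", \"_type\": \"suggest\" }}\n");
            body.append("{ \"keyword\": \"").append(escape(keyword))
                .append("\", \"createDate\": \"").append(date).append("\" }\n");
        }
        return body.toString();
    }
}
```

The returned string ends with a newline, which the bulk API expects after the last line.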
Use suggest to fetch completion suggestions; skip_duplicates removes repeated terms from the result.
- GET /suggest/_search
- {
- "suggest": {
- "my-suggest": {
- "prefix": "项目",
- "completion": {
- "field": "keyword",
- "size": 20,
- "skip_duplicates": true
- }
- }
- }
- }
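To make the intent of prefix, size and skip_duplicates concrete, here is a purely conceptual in-memory sketch. It is not how Elasticsearch implements completion, and it matches on the raw string rather than on the tokens produced by ik_pinyin_analyzer, so pinyin input is out of scope here. PrefixSuggestSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Conceptual illustration only: prefix lookup with optional de-duplication. */
final class PrefixSuggestSketch {
    static List<String> suggest(List<String> indexed, String prefix,
                                int size, boolean skipDuplicates) {
        List<String> out = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String text : indexed) {
            if (!text.startsWith(prefix)) continue;           // prefix match
            if (skipDuplicates && !seen.add(text)) continue;  // drop repeated text
            out.add(text);
            if (out.size() >= size) break;                    // honor the size limit
        }
        return out;
    }
}
```

With the sample data above, the keyword that was indexed twice would appear once when skipDuplicates is true and twice when it is false.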
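Java implementation of search completion
- /**
- * Returns related search suggestions, at most 9.
- * @param key the text the user has typed so far
- * @return the suggested keywords wrapped in an API result
- */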
- public JSONObject getSearchSuggest(String key) {
- CompletionSuggestionBuilder suggestion = SuggestBuilders
- .completionSuggestion("keyword").prefix(key).size(20).skipDuplicates(true);
- SuggestBuilder suggestBuilder = new SuggestBuilder();
- suggestBuilder.addSuggestion("suggest", suggestion);
- SearchResponse response = template.suggest(suggestBuilder, EsConstants.SUGGEST);
- Suggest suggest = response.getSuggest();
- Set<String> keywords = null;
- if (suggest != null) {
- keywords = new HashSet<>();
- List<? extends Suggest.Suggestion.Entry<? extends Suggest.Suggestion.Entry.Option>> entries = suggest.getSuggestion("suggest").getEntries();
- for (Suggest.Suggestion.Entry<? extends Suggest.Suggestion.Entry.Option> entry: entries) {
- for (Suggest.Suggestion.Entry.Option option: entry.getOptions()) {
- /** Return at most 9 suggestions, each at most 20 characters long */
- String keyword = option.getText().string();
- if (!StringUtils.isEmpty(keyword) && keyword.length() <= 20) {
- /** Skip the input itself */
- if (keyword.equals(key)) continue;
- keywords.add(keyword);
- if (keywords.size() >= 9) {
- break;
- }
- }
- }
- }
- }
- return ApiResult.OK(keywords, "获取推荐词组成功");
- }
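The post-processing rules in the method above (at most 9 results, each at most 20 characters, the input itself excluded) can be pulled into a pure function that is testable without a cluster. This is a sketch under that reading of the code; SuggestionFilter and its constants are hypothetical names. It uses a LinkedHashSet so the order returned by the suggester is kept, and it checks the limit before each candidate so the cap holds across all entries, not only within one.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper mirroring the filtering rules of getSearchSuggest. */
final class SuggestionFilter {
    static final int MAX_RESULTS = 9;
    static final int MAX_LENGTH = 20;

    static Set<String> filter(List<String> candidates, String input) {
        Set<String> keywords = new LinkedHashSet<>(); // keeps suggester order
        for (String candidate : candidates) {
            if (keywords.size() >= MAX_RESULTS) break;            // global cap
            if (candidate == null || candidate.isEmpty()) continue;
            if (candidate.length() > MAX_LENGTH) continue;        // too long to display
            if (candidate.equals(input)) continue;                // skip the input itself
            keywords.add(candidate);
        }
        return keywords;
    }
}
```

In getSearchSuggest, the option texts from every entry could be collected into a list and passed through this function before building the response.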
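Hot search recommendations
In the index defined above, the fields property stores a keyword-typed sub-field, keyword.key. Aggregating on keyword.key yields the most frequently searched phrases. A Java implementation follows.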
- public JSONObject searchHot(Map<String, Object> map) {
- Integer size = 10;
- if (!StringUtils.isEmpty(map.get("size"))) {
- size = (Integer)map.get("size");
- }
- /** Time window: the last month */
- String preMonth = LocalDateTime.now().minusMonths(1).format(EsConstants.fomatter);
- String now = LocalDateTime.now().format(EsConstants.fomatter);
- /** Aggregate hot searches over the last month; only terms of at most 10 characters are kept, for display */
- SearchRequestBuilder requestBuilder = template.getClient().prepareSearch(EsConstants.SUGGEST)
- .setQuery(QueryBuilders.rangeQuery("createDate").gte(preMonth).lte(now));
- SearchResponse searchResponse = requestBuilder.addAggregation(AggregationBuilders
- .terms("hotSearch").field("keyword.key").size(size)).execute().actionGet();
- Aggregations aggregations = searchResponse.getAggregations();
- Set<String> keywords = null;
- if (aggregations != null) {
- keywords = new LinkedHashSet<>(); // keeps the bucket order (highest count first)
- Terms hotSearch = aggregations.get("hotSearch");
- List<? extends Terms.Bucket> buckets = hotSearch.getBuckets();
- for (Terms.Bucket bucket: buckets) {
- if (bucket.getKey().toString().length() <= 10) {
- keywords.add((String)bucket.getKey());
- }
- }
- }
- return ApiResult.OK(keywords, "热门搜索获取成功");
- }
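The range bounds in the method above must be formatted the same way as the createDate mapping. Below is a small sketch of computing both bounds from a single timestamp, so the window is consistent and can be tested with a fixed clock. It assumes EsConstants.fomatter uses the pattern yyyy-MM-dd HH:mm:ss from the mapping, which the article does not show; HotSearchWindow is a hypothetical name.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical value object holding the formatted bounds of the last month. */
final class HotSearchWindow {
    // Assumption: same pattern as the createDate mapping and EsConstants.fomatter.
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    final String from;
    final String to;

    private HotSearchWindow(String from, String to) {
        this.from = from;
        this.to = to;
    }

    /** Both bounds derive from the same instant passed in by the caller. */
    static HotSearchWindow lastMonth(LocalDateTime now) {
        return new HotSearchWindow(now.minusMonths(1).format(FORMAT), now.format(FORMAT));
    }
}
```

The two strings would then feed the range query as gte(window.from) and lte(window.to). Note that minusMonths clamps to the last valid day of the previous month, so a window ending on March 31 starts on the last day of February.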