暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

【大数据开发】Elasticsearch推荐系统(十)

数据信息化 2020-08-01
259



Elasticsearch

之小说详情页功能开发-分组聚合TopN(十)

Novel detail page function development - Group aggregation topn

简介

本文将介绍如何实现把作家所有的作品的总点击数加起来求和然后再进行排序

页面展示:

01

实施过程

(1)先通过SQL语句计算出作家排行

select sum(a.clickcount) as countTotal , a.author,group_concat(a.name) novel_names from db_novel.novel a where 1=1 group by author order by countTotal desc


(2)把SQL语句转换成ES语句

语句转换网址:http://www.ischoolbar.com/EsParser/


    转换后的ES语句:

{"aggs":{"author":{"terms":{"field":"author","size":10,"order":{"countTotal":"DESC"}},"aggs":{"countTotal":{"sum":{"field":"clickcount"}},"top":{"top_hits":{"size":1}}}}}}


(3)格式化ES语句

格式化语句网址:http://www.bejson.com/


    格式化后的语句:

{
"aggs": {
"author": {
"terms": {
"field": "author",
"size": 10,
"order": {
"countTotal": "DESC"
}
},
"aggs": {
"countTotal": {
"sum": {
"field": "clickcount"
}
},
"top": {
"top_hits": {
"size": 1
}
}
}
}
}
}


(4)创建索引[novel_author_countsort]

    拷贝之前创建的novel索引脚本,直接修改author字段的类型为[keyword]

PUT novel_author_countsort
{
"mappings":{
"doc" : {
"properties" : {
"@timestamp" : {
"type" : "date"
},
"@version" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"author" : {
"type" : "keyword"
},
"category" : {
"type" : "keyword"
},
"clickcount" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"collect" : {
"type" : "text",
"fields" : {
"pinyin" : {
"type" : "text",
"boost":10.0,
"term_vector":"with_offsets",
"analyzer":"ik_pinyin_analyzer"
}
}
},
"count" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"countrecommend" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"detail" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"id" : {
"type" : "long"
},
"lastchapter" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"lastupdate" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"monthclick" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"monthrecommend" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"name" : {
"type" : "text",
"fields" : {
"pinyin" : {
"type" : "text",
"boost":10.0,
"term_vector":"with_offsets",
"analyzer":"ik_pinyin_analyzer"
}
}
},
"new" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"novelinfo" : {
"type" : "text",
"fields" : {
"pinyin" : {
"type" : "text",
"boost":10.0,
"term_vector":"with_offsets",
"analyzer":"ik_pinyin_analyzer"
}
}
},
"picurl" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"status" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"weekclick" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"weekrecommend" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}


(5)给索引[novel_author_countsort]加载数据

 POST _reindex
{
"source":{
"index":"novel"
},
"dest":{
"index":"novel_author_countsort"
}
}


(6)测试

    基于之前SQL语句生成的ES脚本进行测试

GET novel_author_countsort/_search
{
"size": 0,
"aggs": {
"author": {
"terms": {
"field": "author",
"size": 10, //前10条
"order": {
"countTotal": "DESC"
}
},
"aggs": {
"countTotal": {
"sum": {
"field": "clickcount"
}
},
"top": {
"top_hits": {
"size": 1
}
}
}
}
}
}


(7)代码开发

    @Override
public List<TopDetail> getTopAuthor() {
SearchRequestBuilder builder = EsClientUtils.getClient().prepareSearch(CommonUtil.index_novel_countsort);
TermsAggregationBuilder termsAggregationBuilder = AggregationBuilders
.terms("authorAgg")
.field("author")
.order(BucketOrder.aggregation("clicksum",false))
.subAggregation(
AggregationBuilders
.sum("clicksum")
.field("clickcount")
.subAggregation(AggregationBuilders.topHits("top"))
);

扫描二维码 关注我们

微信号 : BIGDT_IN

文章转载自数据信息化,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论