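Environment: Elasticsearch 2.3.2 with analysis-ik 1.9.3, used as the example throughout.
At first I downloaded the latest release of ik, but after installing it Elasticsearch refused to start and reported a version incompatibility.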
/etc/init.d/elasticsearch start
Starting elasticsearch: Exception in thread "main" java.lang.IllegalArgumentException: Plugin [analysis-ik] is incompatible with Elasticsearch [2.3.2]. Was designed for version [5.0.0]
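The error message already states which plugin is at fault, which Elasticsearch version is running, and which version the plugin was built for. As a minimal sketch, the snippet below pulls those three values out of such a line. The function name parse_incompatible_plugin is made up for illustration, and the pattern is based only on the single message shown above, so other Elasticsearch versions may word it differently.

```python
import re

# Pattern derived from the one startup error quoted above (an assumption:
# other Elasticsearch versions may phrase the message differently).
INCOMPATIBLE_RE = re.compile(
    r"Plugin \[(?P<plugin>[^\]]+)\] is incompatible with Elasticsearch "
    r"\[(?P<running>[^\]]+)\]\. Was designed for version \[(?P<designed>[^\]]+)\]"
)


def parse_incompatible_plugin(message):
    """Return (plugin, running_version, designed_for_version), or None if no match."""
    match = INCOMPATIBLE_RE.search(message)
    if match is None:
        return None
    return match.group("plugin"), match.group("running"), match.group("designed")
```

For the message above this yields the plugin name analysis-ik, the running version 2.3.2 and the designed-for version 5.0.0, which tells you that a release built for the 2.3.x line is needed.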
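After looking again, the fix turned out to be simple, and there is no need to rebuild the package with mvn.
Go to https://github.com/medcl/elasticsearch-analysis-ik/releases, download the zip package that matches your Elasticsearch version, and unpack it into /usr/share/elasticsearch/plugins/ik.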
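The unpacking step can also be scripted. The following is a rough sketch using only the Python standard library. The helper name install_plugin_zip and the zip file name in the usage note are assumptions, not something the release page guarantees.

```python
import zipfile
from pathlib import Path


def install_plugin_zip(zip_path, plugin_dir):
    """Unpack zip_path into plugin_dir and return the sorted archive member names."""
    target = Path(plugin_dir)
    # Create plugins/ik (and any missing parents) if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return sorted(archive.namelist())
```

A call might look like install_plugin_zip("elasticsearch-analysis-ik-1.9.3.zip", "/usr/share/elasticsearch/plugins/ik"), where the zip name is hypothetical. Writing under /usr/share usually requires root privileges.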
Configure the dictionaries (ik ships with a Sogou dictionary)
Config file: /usr/share/elasticsearch/plugins/ik/config/ik/IKAnalyzer.cfg.xml
<entry key="ext_dict">custom/mydict.dic;custom/single_word_low_freq.dic;custom/sougou.dic</entry>
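The files listed in ext_dict are, as far as I know, plain UTF-8 text with one term per line. Under that assumption, here is a small sketch for generating a custom dictionary such as custom/mydict.dic. The helper name write_dictionary and the sample terms are illustrative only.

```python
from pathlib import Path


def write_dictionary(path, words):
    """Write one term per line as UTF-8, skipping blank entries and duplicates."""
    seen = set()
    lines = []
    for word in words:
        term = word.strip()
        if term and term not in seen:
            seen.add(term)
            lines.append(term)
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    # newline="\n" keeps Unix line endings regardless of platform.
    with open(target, "w", encoding="utf-8", newline="\n") as handle:
        handle.write("\n".join(lines) + "\n")
    return lines
```

The path passed in should point at the same location that the ext_dict entry resolves to. The plugin most likely needs a restart of Elasticsearch before it picks up the new terms.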
Open ES_HOME/config/elasticsearch.yml
and append the following to the end of the file:
index:
  analysis:
    analyzer:
      ik:
        alias: [ik_analyzer]
        type: org.elasticsearch.index.analysis.IkAnalyzerProvider
      ik_max_word:
        type: ik
        use_smart: false
      ik_smart:
        type: ik
        use_smart: true
index.analysis.analyzer.default.type: ik
Restart elasticsearch
service elasticsearch restart
Test
http://localhost:9200/<any-index-name>/_analyze?analyzer=ik&pretty=true&text=深圳热销限时促销优惠600元
{
"tokens" : [ {
"token" : "深圳",
"start_offset" : 0,
"end_offset" : 2,
"type" : "CN_WORD",
"position" : 0
}, {
"token" : "圳",
"start_offset" : 1,
"end_offset" : 2,
"type" : "CN_WORD",
"position" : 1
}, {
"token" : "热销",
"start_offset" : 2,
"end_offset" : 4,
"type" : "CN_WORD",
"position" : 2
}, {
"token" : "热",
"start_offset" : 2,
"end_offset" : 3,
"type" : "CN_WORD",
"position" : 3
}, {
"token" : "销",
"start_offset" : 3,
"end_offset" : 4,
"type" : "CN_WORD",
"position" : 4
}, {
"token" : "限时",
"start_offset" : 4,
"end_offset" : 6,
"type" : "CN_WORD",
"position" : 5
}, {
"token" : "促销",
"start_offset" : 6,
"end_offset" : 8,
"type" : "CN_WORD",
"position" : 6
}, {
"token" : "促",
"start_offset" : 6,
"end_offset" : 7,
"type" : "CN_WORD",
"position" : 7
}, {
"token" : "销",
"start_offset" : 7,
"end_offset" : 8,
"type" : "CN_WORD",
"position" : 8
}, {
"token" : "优惠",
"start_offset" : 8,
"end_offset" : 10,
"type" : "CN_WORD",
"position" : 9
}, {
"token" : "惠",
"start_offset" : 9,
"end_offset" : 10,
"type" : "CN_WORD",
"position" : 10
}, {
"token" : "600",
"start_offset" : 10,
"end_offset" : 13,
"type" : "ARABIC",
"position" : 11
}, {
"token" : "元",
"start_offset" : 13,
"end_offset" : 14,
"type" : "COUNT",
"position" : 12
} ]
}
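The same check can be run from a script instead of the browser. This is a minimal sketch with the Python standard library, assuming the node listens on http://localhost:9200 and that an index named my_index exists. Both the index name and the helper names are made up for illustration.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_analyze_url(base_url, index, text, analyzer="ik"):
    """Build the _analyze URL with the text percent-encoded as UTF-8."""
    query = urlencode({"analyzer": analyzer, "pretty": "true", "text": text})
    return "{}/{}/_analyze?{}".format(base_url.rstrip("/"), quote(index), query)


def extract_tokens(response):
    """Reduce an _analyze response to (token, start_offset, end_offset) tuples."""
    return [(t["token"], t["start_offset"], t["end_offset"])
            for t in response["tokens"]]


def analyze(base_url, index, text, analyzer="ik"):
    """Call the running node and return its tokens; needs a live Elasticsearch."""
    with urlopen(build_analyze_url(base_url, index, text, analyzer)) as reply:
        return extract_tokens(json.loads(reply.read().decode("utf-8")))
```

Calling analyze("http://localhost:9200", "my_index", "深圳热销限时促销优惠600元") against a node configured as described should return the same tokens as the JSON above, in compact form.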

