Building an Agile Search Engine with PHP + MySQL + Sphinx

November 6th, 2009. Posted in PHP, Ubuntu

Author: 高进波
Date: 2009-11-05
How to build a search engine with PHP + Sphinx on Ubuntu. Tested on Ubuntu 8.04, 64-bit.

Downloads:
wget http://www.coreseek.cn/uploads/csft/3.1/Source/csft-3.1.tar.gz
wget http://www.coreseek.cn/uploads/csft/3.1/Source/mmseg-3.1.tar.gz

Install the required packages:
apt-get install php5-cgi php5-gd php5-mysql mysql-server lighttpd php5-cli libmysql++-dev automake

Installation

1. Install libmmseg first
tar xvzf mmseg-3.1.tar.gz
cd mmseg-3.1
./configure --prefix=/usr/local/mmseg
make && make install
cd ..

2. Install Sphinx
tar xvzf csft-3.1.tar.gz
cd csft-3.1
./configure --prefix=/usr/local/sphinx --with-mmseg-includes=/usr/local/mmseg/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg/lib/ --with-mmseg --enable-id64
make && make install
cd ..

3. Generate the dictionary
cd mmseg-3.1/data/
/usr/local/mmseg/bin/mmseg -u unigram.txt
mv unigram.txt.uni uni.lib
mkdir /usr/local/sphinx/var/dict
cp uni.lib /usr/local/sphinx/var/dict/

4. Configuration
cd /usr/local/sphinx/etc
cp sphinx.conf.dist csft.conf
The database uses UTF-8 encoding. Create a new database named test, then import the sample data:
mysql -uroot -p test < example.sql

vi csft.conf
# Set the database connection information
sql_query_pre                   = SET NAMES utf8
sql_query_pre                   = SET character_set_results = 'utf8'
charset_type = utf-8
ngram_len                               = 1
ngram_chars                     = U+3000..U+2FA1F
charset_table           = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F
charset_dictpath        = /usr/local/sphinx/var/dict/
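
For orientation, here is a rough sketch of where the settings above could sit inside csft.conf. In Sphinx, the sql_* options belong to a source block, while the charset and ngram options belong to an index block. The names src1 and test1, the MySQL credentials, the query, the index path, and the output file name csft-sketch.conf are all assumptions for illustration (src1 and test1 should match the stock sample config, but check your own file). The snippet only writes an example file to compare against and does not touch the real config.

```shell
# Write an illustrative config sketch to the current directory.
# The file name is hypothetical; the real config is /usr/local/sphinx/etc/csft.conf.
SKETCH=./csft-sketch.conf

cat > "$SKETCH" <<'EOF'
source src1
{
    type            = mysql
    # Connection details: assumed values, replace with your own
    sql_host        = localhost
    sql_user        = root
    sql_pass        =
    sql_db          = test
    sql_port        = 3306

    # Make MySQL return UTF-8 before the main query runs
    sql_query_pre   = SET NAMES utf8
    sql_query_pre   = SET character_set_results = 'utf8'

    # Assumed query against the sample documents table
    sql_query       = SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, title, content FROM documents
    sql_attr_uint       = group_id
    sql_attr_timestamp  = date_added
}

index test1
{
    source          = src1
    # Assumed location for the index files
    path            = /usr/local/sphinx/var/data/test1

    # Settings taken from the step above
    charset_type    = utf-8
    ngram_len       = 1
    ngram_chars     = U+3000..U+2FA1F
    charset_table   = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F
    charset_dictpath = /usr/local/sphinx/var/dict/
}
EOF

echo "wrote $SKETCH"
```

Compare the sketch with your own csft.conf and copy over only the lines that are missing, placing each one in the matching block.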

 

5. Build the index and test it
/usr/local/sphinx/bin/indexer --all
/usr/local/sphinx/bin/search test

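Testing Chinese
Add a row to the database:
insert into documents values('5','2','9','2009-11-06 01:40:45','你好,hugwww','我爱你,松山湖,gaojinbo');

Rebuild the index so that the new row becomes searchable:
/usr/local/sphinx/bin/indexer --all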

/usr/local/sphinx/bin/search 我爱你

6. Test queries through the PHP API
cd /usr/local/sphinx/etc
/usr/local/sphinx/bin/searchd

cd csft-3.1/api    # relative to the directory where the source tarball was unpacked
php test.php test

To use the API from your own PHP code, include api/sphinxapi.php.
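
As a minimal sketch of what such a script might look like, the snippet below writes a small PHP file that uses the SphinxClient class from sphinxapi.php. The file name search_demo.php, the host and port, and the index name test1 are assumptions. In particular, the port must match whatever searchd is configured to listen on in your csft.conf.

```shell
# Write a small demo script; the file name is hypothetical.
DEMO=./search_demo.php

cat > "$DEMO" <<'EOF'
<?php
// Expects sphinxapi.php in the same directory (it ships in csft-3.1/api).
require 'sphinxapi.php';

$cl = new SphinxClient();
// Assumed host and port; they must match the searchd settings in csft.conf.
$cl->SetServer('localhost', 9312);
$cl->SetMatchMode(SPH_MATCH_ALL);   // every query word must match
$cl->SetLimits(0, 10);              // first 10 results

// 'test1' is an assumed index name; use the one defined in your config.
$result = $cl->Query('test', 'test1');

if ($result === false) {
    echo 'Query failed: ' . $cl->GetLastError() . "\n";
    exit(1);
}

if (!empty($result['matches'])) {
    // By default, matches are keyed by document id.
    foreach ($result['matches'] as $id => $match) {
        echo "doc $id, weight {$match['weight']}\n";
    }
}
echo "total found: {$result['total_found']}\n";
EOF

echo "wrote $DEMO"
```

To try it, copy search_demo.php next to sphinxapi.php and run php search_demo.php while searchd is running.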

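Note: Chinese searches work, but the results appear garbled when printed in the terminal. This does not affect normal operation.

Done!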
