The search built into WordPress works, but its results are often unsatisfactory: it supports neither combined search nor fuzzy search, so you sometimes have to type the exact keyword to get any results. It also relies heavily on MySQL queries, which can be slow when the database is large.
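The exact-keyword limitation can be illustrated with a small sketch. It assumes the default search boils down to a `LIKE '%term%'` match on the post title and content; an in-memory SQLite database stands in for MySQL, and the two-column `wp_posts` table is a simplified, hypothetical version of the real one.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; wp_posts is reduced to two columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wp_posts (post_title TEXT, post_content TEXT)")
conn.executemany(
    "INSERT INTO wp_posts VALUES (?, ?)",
    [
        ("Install Elasticsearch", "Steps for installing the search engine."),
        ("Nginx reverse proxy", "Proxying requests to an upstream server."),
    ],
)


def like_search(term):
    """Return titles whose title or content contains `term` as a substring."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT post_title FROM wp_posts "
        "WHERE post_title LIKE ? OR post_content LIKE ? "
        "ORDER BY post_title",
        (pattern, pattern),
    )
    return [title for (title,) in rows]


print(like_search("Elasticsearch"))  # exact keyword finds the post
print(like_search("Elastisearch"))   # one missing letter finds nothing
```

A substring match has no notion of typos, synonyms, or word segmentation, and a leading wildcard generally prevents the database from using an ordinary index, which is why the query gets slower as the table grows.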

Replacing the default WordPress search is not difficult. Most webmasters know that searching for site:wzfou.com xxx returns only content from the specified website. Baidu and Google both offer custom search products that embed this kind of site:wzfou.com xxx query directly into your site, so users see the results without leaving for Baidu or Google after clicking search.
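As a rough illustration of the site: operator, the sketch below builds the kind of URL a search box would send a visitor to if it simply redirected to the search engine. The helper name `site_search_url` is made up for this example; the custom search products wrap the same idea in an embeddable widget.

```python
from urllib.parse import urlencode


def site_search_url(domain, keywords, engine="google"):
    """Build a search URL restricted to one site with the site: operator."""
    query = f"site:{domain} {keywords}"
    if engine == "google":
        return "https://www.google.com/search?" + urlencode({"q": query})
    if engine == "baidu":
        return "https://www.baidu.com/s?" + urlencode({"wd": query})
    raise ValueError(f"unsupported engine: {engine}")


print(site_search_url("wzfou.com", "elasticsearch"))
print(site_search_url("wzfou.com", "elasticsearch", engine="baidu"))
```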

However, Baidu and Google custom search depend on the search engine's own index. On a new site with few indexed pages, recent articles often cannot be found, which hurts the user experience. This is where Elasticsearch comes in: a free, open source search engine that you host yourself. It is a distributed, scalable, real-time search and analytics engine that handles full-text search and real-time statistics over structured data.

This article explains how to integrate Baidu custom search, Google Custom Search, and self-hosted Elasticsearch with WordPress. Related articles on building and optimizing WordPress sites:

  1. Linux php-fpm optimization notes: php-fpm processes that use a lot of memory and do not release it
  2. Adding Alipay and WeChat reward buttons and a Paypal.me reward link to WordPress
  3. Five excellent RSS readers, with a roundup of major Chinese and international RSS readers

PS: Updated on September 1, 2019. If you would rather not deal with Baidu or Google custom search, try this excellent third-party on-site search service: Use Algolia to add real-time on-site search to WordPress, with higher quality and more accurate results.

PS: Updated on December 2, 2019. If you want to build a more powerful and faster free on-site search, you can also try RediSearch: RediSearch high-performance full-text search engine, integrated with WordPress for high-quality search.

1. Baidu on-site search engine

website:

  1. https://ziyuan.baidu.com/cse/wiki/introduce

1.1  Use of Baidu on-site search engine

The first step is to add the website domain name you want to use to the Baidu search engine.

2. Google Custom Search

2.1  Use of Google Custom Search

First log in to the official Google Custom Search page, and then click to create a new custom search.

The next step is to set the URLs you want to index, name the search engine, and so on.

Once created, you can click to get the code.

Google Custom Search also provides appearance settings, search result tuning, and other options, which you can adjust to suit your needs.

Google Custom Search allows you to pin specific search results, autocomplete, synonyms, and more.

This is what Google Custom Search results look like. Once embedded in a page, the widget may be affected by the site's existing CSS, so you may need to fine-tune the styling yourself.

Google Custom Search can also search for images, which is really powerful.

2.2  Google Custom Search fails to display

For well-known reasons, Google Custom Search does not load properly in China. How can this be solved? One feasible approach is to put it behind a reverse proxy (see: Nginx reverse proxy); another is to host the Google Custom Search files locally. Both are difficult to implement.

Google can also host the custom search page for you. The hosted page looks like this:

  1. https://cse.google.com/cse/publicurl?cx=011545314673148308753:…

3. Elasticsearch self-built search

website:

  1. https://www.elastic.co

3.1  Install Elasticsearch

To install Elasticsearch you can largely follow the official tutorial. The steps below follow the installation method described on imququ.com:

Both the virtual machine and the production environment run Ubuntu 14.04.4 LTS, with the latest version of Elasticsearch. Before starting, check whether a Java runtime is installed on the machine. If not, install it with the following command:

sudo apt-get install openjdk-7-jre-headless

Download the Elasticsearch 2.3.0 compressed package and unzip it:

wget -c https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/zip/elasticsearch/2.3.0/elasticsearch-2.3.0.zip
unzip elasticsearch-2.3.0.zip

Rename the unzipped elasticsearch-2.3.0 directory to ~/es_root (there are no restrictions on the name and location, you can move it to any location you think is appropriate). Elasticsearch does not require installation and can be run directly (note: it cannot be run with the root account):

cd ~/es_root/bin/
chmod a+x elasticsearch
./elasticsearch

If no error message is printed on the screen, the Elasticsearch service has started successfully. Open a new terminal and verify it with curl:

curl -XGET http://127.0.0.1:9200/?pretty

{
  "name" : "Melissa Gold",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "2.3.0",
    "build_hash" : "8371be8d5fe5df7fb9c0516c474d77b9feddd888",
    "build_timestamp" : "2016-03-29T07:54:48Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.0"
  },
  "tagline" : "You Know, for Search"
}

If you see the output above, everything is working. Otherwise, track down the cause from the error message on the screen. Although Elasticsearch itself is written in Java, it talks to the outside world through a RESTful interface, which is very convenient.
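The same check can be scripted. The sketch below assumes the service listens on http://127.0.0.1:9200 as in the curl command; `fetch_root_info` and `summarize` are illustrative helper names, and the sample dictionary is a trimmed copy of the response shown above so the snippet runs without a server.

```python
import json
from urllib.request import urlopen


def fetch_root_info(base_url="http://127.0.0.1:9200"):
    """GET the root endpoint and return the decoded JSON document."""
    with urlopen(base_url + "/", timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize(info):
    """Pick out the fields worth checking from the root endpoint response."""
    version = info["version"]
    return {
        "cluster": info["cluster_name"],
        "elasticsearch": version["number"],
        "lucene": version["lucene_version"],
    }


# Trimmed copy of the response shown above, used instead of a live request.
sample = {
    "name": "Melissa Gold",
    "cluster_name": "elasticsearch",
    "version": {"number": "2.3.0", "lucene_version": "5.5.0"},
    "tagline": "You Know, for Search",
}
print(summarize(sample))
```

With the service running, `summarize(fetch_root_info())` performs the same check against the live instance.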

By default, Elasticsearch's RESTful service can only be accessed by the local machine, which means that the service in the virtual machine cannot be accessed from the host. To facilitate debugging, you can modify the ~/es_root/config/elasticsearch.yml file and add the following two lines:

network.bind_host: "0.0.0.0"
network.publish_host: _non_loopback:ipv4_

Do not use this configuration in production, otherwise anyone can modify your data through this interface.

3.2  Install IK Analysis

The analyzer that ships with Elasticsearch crudely splits Chinese text into individual characters instead of segmenting it into words based on a dictionary. To handle Chinese searches properly, you also need to install a Chinese word segmentation plug-in. I use elasticsearch-analysis-ik, which supports custom dictionaries.

First, download the elasticsearch-analysis-ik plugin that matches Elasticsearch:

wget -c https://github.com/medcl/elasticsearch-analysis-ik/archive/v1.9.0.zip
unzip v1.9.0.zip

After decompression, go to the plug-in source code directory and compile:

sudo apt-get install maven
cd elasticsearch-analysis-ik-1.9.0
mvn package

If everything goes well, the compiled files can be found in the target/releases/ directory. Unzip it and copy it to the corresponding directory of ~/es_root:

mkdir -p ~/es_root/plugins/ik/
unzip target/releases/elasticsearch-analysis-ik-1.9.0.zip -d ~/es_root/plugins/ik/

The configuration files of elasticsearch-analysis-ik are in the ~/es_root/plugins/ik/config/ik/ directory. Many of them are word lists, which you can edit directly in a text editor. Remember to save them as UTF-8.

Now start the Elasticsearch service. If you see a message similar to the following, it means that the IK Analysis plug-in has been installed:

plugins [analysis-ik]
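Beyond the startup log, the analyze API offers a way to confirm that the plug-in actually segments text. The sketch below only builds the request, so it runs without a server; it assumes the local address used earlier and the `ik_max_word` analyzer name provided by the plug-in. The helper name `build_analyze_request` and the sample sentence are illustrative.

```python
import json
from urllib.request import Request


def build_analyze_request(text, analyzer="ik_max_word",
                          base_url="http://127.0.0.1:9200"):
    """Prepare a POST to the _analyze endpoint without sending it."""
    body = json.dumps({"analyzer": analyzer, "text": text}, ensure_ascii=False)
    return Request(
        base_url + "/_analyze",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_analyze_request("中华人民共和国国歌")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` once the service is running should return the list of tokens the analyzer produced, which makes it easy to see whether whole words or single characters come back.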

3.3  Configure synonyms

Elasticsearch comes with a synonym filter called synonym. In order to make IK and synonym work at the same time, we need to define a new analyzer, use IK as the tokenizer and synonym as the filter. It sounds complicated, but actually all you need to do is add a section of configuration.

Open the ~/es_root/config/elasticsearch.yml file and add the following configuration:

index:
  analysis:
    analyzer:
      ik_syno:
        type: custom
        tokenizer: ik_max_word
        filter: [my_synonym_filter]
      ik_syno_smart:
        type: custom
        tokenizer: ik_smart
        filter: [my_synonym_filter]
    filter:
      my_synonym_filter:
        type: synonym
        synonyms_path: analysis/synonym.txt

The above configuration defines two new analyzers, ik_syno and ik_syno_smart, which correspond to IK's ik_max_word and ik_smart word segmentation strategies respectively. According to the IK documentation, the differences between the two are as follows:

  • ik_max_word: splits the text at the finest granularity, exhausting all possible word combinations;
  • ik_smart: splits the text at the coarsest granularity.
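To make the difference between the two granularities concrete, here is a toy sketch. It is not IK's actual algorithm: the tiny `VOCAB` dictionary and both functions are invented for illustration, with one function emitting every dictionary word it can find and the other taking the longest match from left to right.

```python
# Hypothetical mini dictionary; IK ships with far larger word lists.
VOCAB = {"中华", "华人", "人民", "共和", "共和国", "中华人民共和国", "国歌"}


def finest_split(text, vocab):
    """Emit every dictionary word found at any position (fine-grained)."""
    tokens = []
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            if text[start:end] in vocab:
                tokens.append(text[start:end])
    return tokens


def coarsest_split(text, vocab):
    """Greedy longest match from left to right (coarse-grained)."""
    tokens, start = [], 0
    while start < len(text):
        for end in range(len(text), start, -1):
            # Fall back to a single character when no dictionary word fits.
            if text[start:end] in vocab or end == start + 1:
                tokens.append(text[start:end])
                start = end
                break
    return tokens


sentence = "中华人民共和国国歌"
print(finest_split(sentence, VOCAB))
print(coarsest_split(sentence, VOCAB))
```

The fine-grained output contains overlapping words, which helps recall at index time, while the coarse output contains fewer, longer tokens, which tends to suit query text.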

Both ik_syno and ik_syno_smart use the synonym filter to convert synonyms. To make later testing easier, create the ~/es_root/config/analysis/synonym.txt file, enter a few synonyms, and save it as UTF-8. For example:

ua,user-agent,userAgent
js,javascript
谷歌=>google
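The two rule styles in this file behave differently: a comma-separated line treats all its terms as equivalent, while a line with => rewrites the left side to the right side only. The sketch below is a simplified reading of that format, not the filter's real implementation, and `parse_synonyms` is a made-up helper.

```python
def parse_synonyms(lines):
    """Map each term to the list of terms it should be rewritten to."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=>" in line:
            # Explicit mapping: left-hand terms are replaced by the right side.
            left, right = line.split("=>", 1)
            targets = [term.strip() for term in right.split(",")]
            for source in left.split(","):
                mapping[source.strip()] = targets
        else:
            # Equivalent terms: each one expands to the whole group.
            group = [term.strip() for term in line.split(",")]
            for term in group:
                mapping[term] = group
    return mapping


RULES = ["ua,user-agent,userAgent", "js,javascript", "谷歌=>google"]
table = parse_synonyms(RULES)
print(table["js"])        # equivalent group
print(table["谷歌"])      # one-way mapping
print("google" in table)  # the right-hand side gets no rule of its own
```

Under this reading, a search for js also matches javascript and vice versa, whereas 谷歌 is indexed as google but google is not expanded back to 谷歌.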

3.4  Elasticsearch integrates WordPress

WordPress plugins:

1. ElasticPress: https://wordpress.org/plugins/elasticpress/

2. WP Search with Elasticsearch: https://wordpress.org/plugins/db-search-with-elasticsearch/

Either of these two plug-ins can integrate Elasticsearch search into WordPress. First activate the plug-in, then go to its settings page and fill in the Elasticsearch server details.

Then you can synchronize WordPress articles and pages to the Elasticsearch server and start indexing.
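Once posts are indexed, a search request can opt into the synonym-aware analyzer defined earlier. The sketch below only builds the query body. The field names `post_title` and `post_content` are assumptions about how the plug-in names its fields, so check the actual index mapping before relying on them; `build_search_body` is an illustrative helper.

```python
import json


def build_search_body(keywords, analyzer="ik_syno_smart", size=10):
    """Build a multi_match query that analyzes the keywords with `analyzer`."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": keywords,
                # Assumed field names; the title is weighted twice as much.
                "fields": ["post_title^2", "post_content"],
                "analyzer": analyzer,
            }
        },
    }


body = build_search_body("谷歌 js")
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Using the coarse-grained ik_syno_smart analyzer for the query while indexing with the fine-grained ik_syno is a common pairing, but whether it suits a given site is a judgment call.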

4. Summary

Baidu on-site search suits websites that do not use HTTPS and already have a large number of pages indexed by Baidu. It is a poor fit for new sites and for sites that use HTTPS. Google Custom Search suits users outside China; for users in China it is not worth the trouble.

Self-hosted Elasticsearch is a very capable tool. Full-text search for WordPress barely scratches the surface of what it can do: Elasticsearch can index, search, sort, and filter documents, and it can run complex full-text queries.
