Hotel Data Full-Text Search Exercise (1): Keyword Query and Conditional Result Filtering

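Preface

Building on what we have learned about Elasticsearch so far, we will practice with a comprehensive example. Project source files (back-end and front-end code are included, but the es query part is left for you to write, which you can do by following this article): hotel-demo.zip. SQL script for the exercise project: tb_hotel.sql

This article is based on the 黑马 video course on Bilibili: 黑马旅游案例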

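Project structure

[Image: image-20220120161531460]

Creating the index

After importing the SQL data into the MySQL database and configuring the database connection, we need to build an index in es for the hotel data stored in MySQL. This was covered in an earlier article: elasticsearch-RestClient索引库以及基本文档操作 (index and basic document operations with the elasticsearch RestClient)

Creating the mapping

Here we simply write a unit test to do it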

public class HotelIndexTest {

    private RestHighLevelClient client;

    /**
     * Initialize the client
     */
    @BeforeEach
    void setup(){
        client = new RestHighLevelClient(RestClient.builder(
                HttpHost.create("http://192.168.113.130:9200")
        ));
    }

    /**
     * Create the index
     * @throws IOException
     */
    @Test
    void createHotelIndex() throws IOException {
        // 1. Create the request object
        CreateIndexRequest request = new CreateIndexRequest("hotel");
        // 2. Prepare the request body: the DSL statement
        request.source(MAPPING_TEMPLATE, XContentType.JSON);
        // 3. Send the request
        client.indices().create(request, RequestOptions.DEFAULT);
    }

    /**
     * Close the client
     * @throws IOException
     */
    @AfterEach
    void close() throws IOException {
        client.close();
    }
}

The mapping code (the declaration of the class that holds this constant is omitted)


    public static final String MAPPING_TEMPLATE = "{\n" +
            "  \"mappings\": {\n" +
            "    \"properties\": {\n" +
            "      \"id\": {\n" +
            "        \"type\": \"keyword\"\n" +
            "      },\n" +
            "      \"name\":{\n" +
            "        \"type\": \"text\",\n" +
            "        \"analyzer\": \"ik_max_word\",\n" +
            "        \"copy_to\": \"all\"\n" +
            "      },\n" +
            "      \"address\":{\n" +
            "        \"type\": \"keyword\",\n" +
            "        \"index\": false\n" +
            "      },\n" +
            "      \"price\":{\n" +
            "        \"type\": \"integer\"\n" +
            "      },\n" +
            "      \"score\":{\n" +
            "        \"type\": \"integer\"\n" +
            "      },\n" +
            "      \"brand\":{\n" +
            "        \"type\": \"keyword\",\n" +
            "        \"copy_to\": \"all\"\n" +
            "      },\n" +
            "      \"city\":{\n" +
            "        \"type\": \"keyword\",\n" +
            "        \"copy_to\": \"all\"\n" +
            "      },\n" +
            "      \"starName\":{\n" +
            "        \"type\": \"keyword\"\n" +
            "      },\n" +
            "      \"business\":{\n" +
            "        \"type\": \"keyword\"\n" +
            "      },\n" +
            "      \"location\":{\n" +
            "        \"type\": \"geo_point\"\n" +
            "      },\n" +
            "      \"pic\":{\n" +
            "        \"type\": \"keyword\",\n" +
            "        \"index\": false\n" +
            "      },\n" +
            "      \"all\":{\n" +
            "        \"type\": \"text\",\n" +
            "        \"analyzer\": \"ik_max_word\"\n" +
            "      }\n" +
            "    }\n" +
            "  }\n" +
            "}";
}

Bulk-importing the data into es to build the index

@SpringBootTest
public class HotelDocTest {
    private RestHighLevelClient client;

    @Autowired
    private HotelService hotelService;

    /**
     * Initialize the client
     */
    @BeforeEach
    void setup() {
        client = new RestHighLevelClient(RestClient.builder(
                HttpHost.create("http://192.168.113.130:9200")
        ));
    }

    // Bulk import: one IndexRequest per hotel, sent in a single BulkRequest
    @Test
    void testBulk() throws IOException {
        BulkRequest bulkRequest = new BulkRequest();
        List<Hotel> hotelList = hotelService.list();
        hotelList.forEach(item -> {
            HotelDoc hotelDoc = new HotelDoc(item);
            bulkRequest.add(new IndexRequest("hotel")
                    .id(hotelDoc.getId().toString())
                    .source(JSON.toJSONString(hotelDoc), XContentType.JSON));
        });
        client.bulk(bulkRequest, RequestOptions.DEFAULT);
    }

    /**
     * Close the client
     *
     * @throws IOException
     */
    @AfterEach
    void close() throws IOException {
        client.close();
    }
}

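There is a HotelDoc class here because, for the location field, es expects latitude and longitude joined by a comma in a single string. This class serves as the object that maps to the es document.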

@Data
@NoArgsConstructor
public class HotelDoc {
    private Long id;
    private String name;
    private String address;
    private Integer price;
    private Integer score;
    private String brand;
    private String city;
    private String starName;
    private String business;
    private String location;
    private String pic;

    public HotelDoc(Hotel hotel) {
        this.id = hotel.getId();
        this.name = hotel.getName();
        this.address = hotel.getAddress();
        this.price = hotel.getPrice();
        this.score = hotel.getScore();
        this.brand = hotel.getBrand();
        this.city = hotel.getCity();
        this.starName = hotel.getStarName();
        this.business = hotel.getBusiness();
        this.location = hotel.getLatitude() + ", " + hotel.getLongitude();
        this.pic = hotel.getPic();
    }
}
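
To make the location format concrete, here is a minimal sketch of the string that the HotelDoc constructor builds. The class and method names (GeoPointFormat, toGeoPoint) are hypothetical and only for illustration, and we assume latitude and longitude are available as strings. Note that when a geo_point is given as a string, latitude comes first.

```java
// Sketch only: GeoPointFormat and toGeoPoint are hypothetical names,
// not part of the hotel-demo project.
class GeoPointFormat {

    // Joins latitude and longitude in the "lat, lon" order used by HotelDoc.
    static String toGeoPoint(String latitude, String longitude) {
        return latitude + ", " + longitude;
    }

    public static void main(String[] args) {
        // Coordinates taken from the sample document shown in this article.
        System.out.println(toGeoPoint("31.251433", "121.47522"));
    }
}
```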

After running the testBulk method, we can check the result in Kibana's dev tools

GET /hotel/_search
{
  "query": {
    "match_all": {}
  }
}

Response

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 201,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "hotel",
        "_type" : "_doc",
        "_id" : "36934",
        "_score" : 1.0,
        "_source" : {
          "address" : "静安交通路40号",
          "brand" : "7天酒店",
          "business" : "四川北路商业区",
          "city" : "上海",
          "id" : 36934,
          "location" : "31.251433, 121.47522",
          "name" : "7天连锁酒店(上海宝山路地铁站店)",
          "pic" : "https://m.tuniucdn.com/fb2/t1/G1/M00/3E/40/Cii9EVkyLrKIXo1vAAHgrxo_pUcAALcKQLD688AAeDH564_w200_h200_c1_t0.jpg",
          "price" : 336,
          "score" : 37,
          "starName" : "二钻"
        }
      },
..... (rest omitted)

For installing es and kibana, see: docker单机与集群部署elasticsearch (deploying elasticsearch with docker, single node and cluster)

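Searching documents

Start the project and open http://localhost:8089

[Image: image-20220120193132740]

The data on the page is currently fake and hard-coded in the front-end code. Open the browser console, click the search button, and look at the request parameters

[Image: image-20220120193342361]

[Image: image-20220120193430780]

Let's also select some filter conditions and try again

[Image: image-20220120193814423]

Based on these parameters we create the request object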

@Data
public class RequestParams {
    /**
     * Search keyword
     */
    private String key;

    /**
     * Current page
     */
    private Integer page;

    /**
     * Number of results per page
     */
    private Integer size;

    /**
     * Field to sort by
     */
    private String sortBy;

    /**
     * City
     */
    private String city;

    /**
     * Star rating
     */
    private String starName;

    /**
     * Brand
     */
    private String brand;

    /**
     * Minimum price
     */
    private Integer minPrice;

    /**
     * Maximum price
     */
    private Integer maxPrice;
}

Looking at the front-end code, we can see that it needs two fields, hotels and total, so we build the response object accordingly

[Image: image-20220121095325700]

@Data
public class ResponseObject {
    private long total;

    private List<HotelDoc> hotels;
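}

Keyword query and pagination

We will go step by step, starting with a simple feature: keyword query with pagination

First we create the RestHighLevelClient object by configuring a bean in the startup class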

@Bean
public RestHighLevelClient restHighLevelClient(){
    return new RestHighLevelClient(RestClient.builder(
        HttpHost.create("http://192.168.113.130:9200")
    ));
}

Then we define the query method in the IHotelService interface and implement it

@Service
@RequiredArgsConstructor
public class HotelService extends ServiceImpl<HotelMapper, Hotel> implements IHotelService {

    private final RestHighLevelClient client;

    @Override
    public ResponseObject list(RequestParams requestParams) {
        SearchRequest request = new SearchRequest("hotel");
        if (StringUtils.isNotEmpty(requestParams.getKey())){
            // Keyword query against the combined "all" field
            request.source().query(QueryBuilders.matchQuery("all",requestParams.getKey()));
        }else{
            request.source().query(QueryBuilders.matchAllQuery());
        }
        // Pagination: from is the offset of the first hit on the requested page
        int page = requestParams.getPage();
        int size = requestParams.getSize();
        request.source().from((page - 1) * size).size(size);

        ResponseObject responseObject = new ResponseObject();
        List<HotelDoc> hotelDocList = new ArrayList<>();
        try {
            SearchResponse response = client.search(request, RequestOptions.DEFAULT);
            SearchHits hits = response.getHits();
            long total = hits.getTotalHits().value;
            responseObject.setTotal(total);
            SearchHit[] searchHits = hits.getHits();
            for (SearchHit searchHit : searchHits) {
                HotelDoc hotelDoc = JSONObject.parseObject(searchHit.getSourceAsString(),HotelDoc.class);
                hotelDocList.add(hotelDoc);
            }
            responseObject.setHotels(hotelDocList);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return responseObject;
    }
}
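
The pagination arithmetic above can be sketched on its own. The service code unboxes page and size directly, so a request without them would throw a NullPointerException. The sketch below falls back to defaults instead. The class name PageMath and the default values (page 1, 10 per page) are our assumptions, not something the project defines.

```java
// Sketch only: PageMath and its defaults are hypothetical, chosen for illustration.
class PageMath {
    static final int DEFAULT_PAGE = 1;   // assumed default page
    static final int DEFAULT_SIZE = 10;  // assumed default page size

    // Returns {from, size}, where from = (page - 1) * size.
    // Missing or non-positive values fall back to the defaults.
    static int[] fromAndSize(Integer page, Integer size) {
        int p = (page == null || page < 1) ? DEFAULT_PAGE : page;
        int s = (size == null || size < 1) ? DEFAULT_SIZE : size;
        return new int[] { (p - 1) * s, s };
    }

    public static void main(String[] args) {
        int[] r = fromAndSize(3, 5);
        // Page 3 with 5 hits per page skips the first 10 hits.
        System.out.println("from=" + r[0] + ", size=" + r[1]);
    }
}
```

The two values would then be passed to request.source().from(...).size(...) exactly as in the service method.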

Create HotelController and call the Service-layer method

@RestController
@RequestMapping("/hotel")
@RequiredArgsConstructor
public class HotelController {

    private final IHotelService hotelService;

    @PostMapping("/list")
    public ResponseObject list(@RequestBody RequestParams requestParams){
        return hotelService.list(requestParams);
    }
}

Start the project and check the result on the page:

[Image: image-20220121100952566]

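Conditional filtering

The second step is to implement conditional filtering

The filter conditions include:

  • City (city): exact match
  • Brand (brand): exact match
  • Star rating (starName): exact match
  • Price (price): range query

There are several conditions here in addition to the keyword full-text search, so we combine them with a bool query (BoolQueryBuilder)

We modify the list method of HotelService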

@Service
@RequiredArgsConstructor
public class HotelService extends ServiceImpl<HotelMapper, Hotel> implements IHotelService {

    private final RestHighLevelClient client;

    @Override
    public ResponseObject list(RequestParams requestParams) {
        SearchRequest request = new SearchRequest("hotel");

        // Create the bool query
        BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();

        // Add the keyword query and the filter conditions
        buildBasicQuery(requestParams, boolQuery);

        request.source().query(boolQuery);
        // Pagination
        int page = requestParams.getPage();
        int size = requestParams.getSize();
        request.source().from((page - 1) * size).size(size);

        ResponseObject responseObject = new ResponseObject();
        List<HotelDoc> hotelDocList = new ArrayList<>();
        try {
            SearchResponse response = client.search(request, RequestOptions.DEFAULT);
            SearchHits hits = response.getHits();
            long total = hits.getTotalHits().value;
            responseObject.setTotal(total);
            SearchHit[] searchHits = hits.getHits();
            for (SearchHit searchHit : searchHits) {
                HotelDoc hotelDoc = JSONObject.parseObject(searchHit.getSourceAsString(), HotelDoc.class);
                hotelDocList.add(hotelDoc);
            }
            responseObject.setHotels(hotelDocList);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return responseObject;
    }

    private void buildBasicQuery(RequestParams requestParams, BoolQueryBuilder boolQuery) {
        // Keyword query: goes into must, so it contributes to the relevance score
        if (StringUtils.isNotEmpty(requestParams.getKey())) {
            boolQuery.must(QueryBuilders.matchQuery("all", requestParams.getKey()));
        } else {
            boolQuery.must(QueryBuilders.matchAllQuery());
        }
        // Conditional filtering: exact matches go into filter and do not affect the score
        if (StringUtils.isNotEmpty(requestParams.getCity())){
            boolQuery.filter(QueryBuilders.termQuery("city", requestParams.getCity()));
        }

        if (StringUtils.isNotEmpty(requestParams.getBrand())){
            boolQuery.filter(QueryBuilders.termQuery("brand", requestParams.getBrand()));
        }

        if (StringUtils.isNotEmpty(requestParams.getStarName())){
            boolQuery.filter(QueryBuilders.termQuery("starName", requestParams.getStarName()));
        }

        // Price range: applied only when both the minimum and the maximum are present
        if (requestParams.getMinPrice() != null && requestParams.getMaxPrice() != null){
            boolQuery.filter(QueryBuilders.rangeQuery("price").gte(requestParams.getMinPrice()).lte(requestParams.getMaxPrice()));
        }
    }
}
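
To see what the bool query looks like as DSL, here is a rough sketch that assembles the equivalent JSON by hand, using only the JDK. It is not how the RestHighLevelClient serializes the request, and the class BoolDslSketch with its helper methods is hypothetical. The keyword goes into must and the exact-match and range conditions go into filter, mirroring buildBasicQuery. The sample values are placeholders written with Unicode escapes.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: hand-built JSON to illustrate the shape of the bool query.
// BoolDslSketch and its helpers are hypothetical names.
class BoolDslSketch {

    // Minimal JSON string quoting, enough for the values in this sketch.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    static boolean hasText(String s) {
        return s != null && !s.isEmpty();
    }

    // A term clause: exact match on a keyword field.
    static String term(String field, String value) {
        return "{\"term\":{" + quote(field) + ":" + quote(value) + "}}";
    }

    static String build(String key, String city, String brand, String starName,
                        Integer minPrice, Integer maxPrice) {
        // must: full-text match on "all", or match_all when there is no keyword
        String must = hasText(key)
                ? "{\"match\":{\"all\":" + quote(key) + "}}"
                : "{\"match_all\":{}}";

        // filter: conditions that narrow the result without affecting the score
        List<String> filters = new ArrayList<>();
        if (hasText(city)) filters.add(term("city", city));
        if (hasText(brand)) filters.add(term("brand", brand));
        if (hasText(starName)) filters.add(term("starName", starName));
        if (minPrice != null && maxPrice != null) {
            filters.add("{\"range\":{\"price\":{\"gte\":" + minPrice
                    + ",\"lte\":" + maxPrice + "}}}");
        }

        return "{\"query\":{\"bool\":{\"must\":[" + must + "],\"filter\":["
                + String.join(",", filters) + "]}}}";
    }

    public static void main(String[] args) {
        // Placeholder inputs: keyword "hotel" and city "Shanghai" in Chinese,
        // written as Unicode escapes, with a price range of 100 to 300.
        System.out.println(build("\u9152\u5e97", "\u4e0a\u6d77", null, null, 100, 300));
    }
}
```

The printed JSON can be pasted after GET /hotel/_search in Kibana's dev tools to compare against the results returned by the Java code.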

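Now let's look at the query result again

[Image: image-20220121103727525]

There is a small problem in the front end: searching for four-star or five-star hotels returns nothing. The cause is that the front-end value is missing the character 级, which we can fix ourselves

[Image: image-20220121104820571]

In the next part (Hotel Data Full-Text Search Exercise (2): Sorting by Distance to Implement the Nearest-to-Me Feature) we will cover how to implement sorting, such as sorting by price, as well as sorting by distance to find the hotels nearest to me.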


Original article: https://www.zhaojun.ink/archives/1010
Author: 卑微幻想家
Published: 2022-01-21