Using JavaScript Script for Complex Query Rewriting
INFINI Gateway
2022-04-19

There is a business requirement:

How can we support cross-cluster search in the gateway? I want to achieve the following: the incoming search request is lp:9200/index1/_search, and this index is spread across 3 clusters. Can the gateway rewrite it to lp:9200/cluster01:index1,cluster02:index1,cluster03:index1/_search? There are over a hundred index names, not just “app”, and a single request may contain several indexes.

The content_regex_replace filter provided by the INFINI Gateway can do regex-based text replacement, but it is not a good fit here because the replacement depends on variable index names. Is there another way to achieve this?

Using Script Filter #

Of course there is. In principle, we only need to match the index name index1 and replace it with cluster01:index1,cluster02:index1,cluster03:index1.

The answer is to use a custom script. No matter how complex the business logic is, it can be implemented through a custom script. If one line of script is not enough, use two lines.

We can achieve this functionality using the JavaScript filter provided by the INFINI Gateway. Let’s dive into the details.

Defining the Script #

First, create a script file and place it in the scripts subdirectory under the gateway’s data directory, as shown below:

➜  gateway ✗ tree data
data
└── gateway
    └── nodes
        └── c9bpg0ai4h931o4ngs3g
            ├── kvdb
            ├── queue
            ├── scripts
            │   └── index_path_rewrite.js
            └── stats

The content of this script is as follows:

function process(context) {
  var originalPath = context.Get("_ctx.request.path");
  var matches = originalPath.match(/\/?(.*?)\/_search/);
  var indexNames = [];
  if (matches && matches.length > 1) {
    indexNames = matches[1].split(",");
  }
  var resultNames = [];
  var clusterNames = ["cluster01", "cluster02"];
  if (indexNames.length > 0) {
    for (var i = 0; i < indexNames.length; i++) {
      if (indexNames[i].length > 0) {
        for (var j = 0; j < clusterNames.length; j++) {
          resultNames.push(clusterNames[j] + ":" + indexNames[i]);
        }
      }
    }
  }

  if (resultNames.length > 0) {
    var newPath = "/" + resultNames.join(",") + "/_search";
    context.Put("_ctx.request.path", newPath);
  }
}

Similar to regular JavaScript, we define a specific function process to handle the context information in the request. _ctx.request.path is a built-in context variable of the gateway, used to retrieve the request path. We can access it within the script using context.Get("_ctx.request.path").

In the script, we use JavaScript’s regular expression matching and string manipulation to build the new path in the newPath variable. Finally, we use context.Put("_ctx.request.path", newPath) to update the gateway’s request path, thereby rewriting the index part of the search request.

For a list of built-in context variables in the gateway, please visit Request Context.
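
One thing to watch: the regular expression in the script is not anchored, so a path such as /abc/_search/template would also match and be rewritten to a plain _search path. A stricter variant can be sketched as a pure helper function. Here expandIndexPath is a hypothetical name, not a gateway API, and the sketch assumes that _ctx.request.path carries no query string and that the gateway's script engine accepts extra top-level functions next to process:

```javascript
// Hypothetical helper (not part of INFINI Gateway): expands every index name in
// a path of the exact form /<index>[,<index>...]/_search with each cluster prefix.
// Returns the new path, or null when the path should be left untouched.
function expandIndexPath(path, clusterNames) {
  // Anchored on both ends, so /abc/_search/template or /_search do not match.
  var matches = path.match(/^\/([^\/]+)\/_search$/);
  if (!matches) {
    return null;
  }
  var indexNames = matches[1].split(",");
  var resultNames = [];
  for (var i = 0; i < indexNames.length; i++) {
    if (indexNames[i].length === 0) {
      continue; // skip empty entries such as "abc,,efg"
    }
    for (var j = 0; j < clusterNames.length; j++) {
      resultNames.push(clusterNames[j] + ":" + indexNames[i]);
    }
  }
  if (resultNames.length === 0) {
    return null;
  }
  return "/" + resultNames.join(",") + "/_search";
}

// Assumed wiring inside the gateway script (shown as a comment only):
//   function process(context) {
//     var newPath = expandIndexPath(context.Get("_ctx.request.path"),
//                                   ["cluster01", "cluster02"]);
//     if (newPath) { context.Put("_ctx.request.path", newPath); }
//   }

console.log(expandIndexPath("/abc,efg/_search", ["cluster01", "cluster02"]));
console.log(expandIndexPath("/abc/_search/template", ["cluster01", "cluster02"]));
```

Because the helper touches no gateway objects, it can be run with plain Node.js before the script is deployed. Whether non-matching paths should pass through unchanged or be rejected is a design choice; this sketch passes them through.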

Configuring the Gateway #

Next, create a gateway configuration and use the javascript filter to call the script, as shown below:

entry:
  - name: my_es_entry
    enabled: true
    router: my_router
    max_concurrency: 10000
    network:
      binding: 0.0.0.0:8000

flow:
  - name: default_flow
    filter:
      - dump:
          context:
            - _ctx.request.path
      - javascript:
          file: index_path_rewrite.js
      - dump:
          context:
            - _ctx.request.path
      - elasticsearch:
          elasticsearch: dev
router:
  - name: my_router
    default_flow: default_flow

elasticsearch:
  - name: dev
    enabled: true
    schema: http
    hosts:
      - 192.168.3.188:9206

In the example above, we use a javascript filter and specify the script file to be loaded as index_path_rewrite.js. We also use two dump filters to output the path information before and after running the script. Finally, we use an elasticsearch filter to forward the request to Elasticsearch for querying.

Starting the Gateway #

Let’s start the gateway and test it:

➜  gateway ✗ ./bin/gateway
   ___   _   _____  __  __    __  _
  / _ \ /_\ /__   \/__\/ / /\ \ \/_\ /\_/\
 / /_\///_\\  / /\/_\  \ \/  \/ //_\\\_ _/
/ /_\\/  _  \/ / //__   \  /\  /  _  \/ \
\____/\_/ \_/\/  \__/    \/  \/\_/ \_/\_/

[GATEWAY] A light-weight, powerful and high-performance elasticsearch gateway.
[GATEWAY] 1.0.0_SNAPSHOT, 2022-04-18 07:11:09, 2023-12-31 10:10:10, 8062c4bc6e57a3fefcce71c0628d2d4141e46953
[04-19 11:41:29] [INF] [app.go:174] initializing gateway.
[04-19 11:41:29] [INF] [app.go:175] using config: /Users/medcl/go/src/infini.sh/gateway/gateway.yml.
[04-19 11:41:29] [INF] [instance.go:72] workspace: /Users/medcl/go/src/infini.sh/gateway/data/gateway/nodes/c9bpg0ai4h931o4ngs3g
[04-19 11:41:29] [INF] [app.go:283] gateway is up and running now.
[04-19 11:41:30] [INF] [api.go:262] api listen at: http://0.0.0.0:2900
[04-19 11:41:30] [INF] [entry.go:312] entry [my_es_entry] listen at: http://0.0.0.0:8000
[04-19 11:41:30] [INF] [module.go:116] all modules are started
[04-19 11:41:30] [INF] [actions.go:349] elasticsearch [dev] is available

Performing the Test #

Run the following query to verify the results:

curl localhost:8000/abc,efg/_search

You will see the debug information printed by the two dump filters:

---- DUMPING CONTEXT ----
_ctx.request.path  :  /abc,efg/_search
---- DUMPING CONTEXT ----
_ctx.request.path  :  /cluster01:abc,cluster02:abc,cluster01:efg,cluster02:efg/_search

The query has been rewritten according to our requirement. Nice!
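
If a running gateway and cluster are not at hand, the same check can be approximated locally. The sketch below uses a stand-in for the context object that mimics only the Get and Put calls used in this article; makeMockContext is a hypothetical helper, and the real gateway context may behave differently in ways this mock does not capture:

```javascript
// Hypothetical stand-in for the gateway context: only Get and Put are mimicked.
function makeMockContext(initial) {
  var store = {};
  for (var key in initial) {
    store[key] = initial[key];
  }
  return {
    Get: function (name) { return store[name]; },
    Put: function (name, value) { store[name] = value; }
  };
}

// Same logic as process() in index_path_rewrite.js, renamed here only so that
// it does not shadow the global process object when run under Node.js.
function processRequest(context) {
  var originalPath = context.Get("_ctx.request.path");
  var matches = originalPath.match(/\/?(.*?)\/_search/);
  var indexNames = [];
  if (matches && matches.length > 1) {
    indexNames = matches[1].split(",");
  }
  var resultNames = [];
  var clusterNames = ["cluster01", "cluster02"];
  for (var i = 0; i < indexNames.length; i++) {
    if (indexNames[i].length > 0) {
      for (var j = 0; j < clusterNames.length; j++) {
        resultNames.push(clusterNames[j] + ":" + indexNames[i]);
      }
    }
  }
  if (resultNames.length > 0) {
    context.Put("_ctx.request.path", "/" + resultNames.join(",") + "/_search");
  }
}

var ctx = makeMockContext({ "_ctx.request.path": "/abc,efg/_search" });
processRequest(ctx);
console.log(ctx.Get("_ctx.request.path"));
// prints: /cluster01:abc,cluster02:abc,cluster01:efg,cluster02:efg/_search
```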

Rewriting DSL Query Statements #

Okay, we just modified the index of the search, but what about the DSL query statement? Can we rewrite that too? Of course!

Take a look at the following example:

function process(context) {
  var originalDSL = context.Get("_ctx.request.body");
  if (originalDSL.length > 0) {
    var jsonObj = JSON.parse(originalDSL);
    jsonObj.size = 123;
    jsonObj.aggs = {
      test1: {
        terms: {
          field: "abc",
          size: 10,
        },
      },
    };
    context.Put("_ctx.request.body", JSON.stringify(jsonObj));
  }
}

First, we retrieve the request body and parse it into a JSON object. Then we can freely modify the query object, serialize it, and write it back to the request. That’s all.
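
Note that JSON.parse throws a SyntaxError on a body that is not valid JSON. A more defensive version can be sketched as a pure helper that returns null whenever the body should be left alone. Here rewriteSearchBody is a hypothetical name, and the sketch assumes that context.Get("_ctx.request.body") yields a string, as the script above implies by passing it to JSON.parse:

```javascript
// Hypothetical helper (not a gateway API): takes the raw body text and returns
// the rewritten JSON text, or null when the body is empty or not a JSON object.
function rewriteSearchBody(bodyText) {
  if (typeof bodyText !== "string" || bodyText.length === 0) {
    return null; // nothing to rewrite, e.g. a search without a body
  }
  var jsonObj;
  try {
    jsonObj = JSON.parse(bodyText);
  } catch (e) {
    return null; // leave non-JSON bodies untouched instead of failing
  }
  if (jsonObj === null || typeof jsonObj !== "object" || Array.isArray(jsonObj)) {
    return null; // a search DSL body is expected to be a JSON object
  }
  // Same modifications as in the article's script.
  jsonObj.size = 123;
  jsonObj.aggs = { test1: { terms: { field: "abc", size: 10 } } };
  return JSON.stringify(jsonObj);
}

// Assumed wiring inside the gateway script (shown as a comment only):
//   function process(context) {
//     var newBody = rewriteSearchBody(context.Get("_ctx.request.body"));
//     if (newBody) { context.Put("_ctx.request.body", newBody); }
//   }

console.log(rewriteSearchBody('{"query":{}}'));
console.log(rewriteSearchBody("not json"));
```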

Let’s test it:

curl -XPOST -H 'Content-Type: application/json' localhost:8000/abc,efg/_search -d '{"query":{}}'

Output:

---- DUMPING CONTEXT ----
_ctx.request.path  :  /abc,efg/_search
_ctx.request.body  :  {"query":{}}
[04-19 18:14:24] [INF] [reverseproxy.go:255] elasticsearch [dev] hosts: [] => [192.168.3.188:9206]
---- DUMPING CONTEXT ----
_ctx.request.path  :  /abc,efg/_search
_ctx.request.body  :  {"query":{},"size":123,"aggs":{"test1":{"terms":{"field":"abc","size":10}}}}

Feel like you’ve unlocked a whole new world, right?

Conclusion #

Using the JavaScript script filter in the gateway allows for flexible operations to meet complex business requirements. You can manipulate the request context information using custom scripts, enabling you to achieve various transformations and substitutions.
