There is a business requirement:
How can the gateway support cross-cluster search? The incoming search request is lp:9200/index1/_search, and this index is spread across three clusters. Can the gateway rewrite the request to lp:9200/cluster01:index1,cluster02:index1,cluster03:index1/_search? There are over a hundred index names, not just “app”, and a single request may list several indices together.
Although the content_regex_replace filter provided by the INFINI Gateway can do character-based regex replacement, it does not fit this requirement, which needs variable substitution. Is there any other way to achieve this?
Using the Script Filter
Of course there is. In principle, we only need to match the index name index1 and replace it with cluster01:index1,cluster02:index1,cluster03:index1.
The answer is to use a custom script. No matter how complex the business logic is, it can be implemented through a custom script. If one line of script is not enough, use two lines.
We can achieve this with the JavaScript filter provided by the INFINI Gateway. Let’s dive into the details.
Defining the Script
First, create a script file and place it in the scripts subdirectory under the gateway’s data directory, as shown below:
➜ gateway ✗ tree data
data
└── gateway
    └── nodes
        └── c9bpg0ai4h931o4ngs3g
            ├── kvdb
            ├── queue
            ├── scripts
            │   └── index_path_rewrite.js
            └── stats
The content of this script is as follows:
function process(context) {
    // Read the incoming request path, e.g. /abc,efg/_search
    var originalPath = context.Get("_ctx.request.path");
    // Capture the comma-separated index list in front of /_search
    var matches = originalPath.match(/\/?(.*?)\/_search/);
    var indexNames = [];
    if (matches && matches.length > 1) {
        indexNames = matches[1].split(",");
    }
    var resultNames = [];
    var clusterNames = ["cluster01", "cluster02"];
    // Prefix every non-empty index name with each cluster name
    if (indexNames.length > 0) {
        for (var i = 0; i < indexNames.length; i++) {
            if (indexNames[i].length > 0) {
                for (var j = 0; j < clusterNames.length; j++) {
                    resultNames.push(clusterNames[j] + ":" + indexNames[i]);
                }
            }
        }
    }
    // Only rewrite the path when at least one index name was expanded
    if (resultNames.length > 0) {
        var newPath = "/" + resultNames.join(",") + "/_search";
        context.Put("_ctx.request.path", newPath);
    }
}
As in regular JavaScript, we define a function named process to handle the context of each request. _ctx.request.path is a built-in context variable of the gateway that holds the request path. We can read it within the script using context.Get("_ctx.request.path").
In the script, we use JavaScript’s regular expression matching and string manipulation to build the new path in the newPath variable. Finally, we call context.Put("_ctx.request.path", newPath) to update the gateway’s request path, which rewrites the index names in the search request.
For a list of built-in context variables in the gateway, please visit Request Context.
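To try the rewrite logic without starting the gateway, it can be run under Node.js against a stand-in context object. The sketch below is only an approximation: makeContext is a hypothetical helper that models just Get and Put on top of a plain object, and the function is named processRequest here (rather than process) purely to avoid shadowing Node's global process object. The body is the same logic as index_path_rewrite.js above.

```javascript
// Hypothetical stand-in for the gateway context; only Get/Put are modelled.
function makeContext(initial) {
  var store = {};
  for (var key in initial) {
    if (Object.prototype.hasOwnProperty.call(initial, key)) {
      store[key] = initial[key];
    }
  }
  return {
    Get: function (key) { return store[key]; },
    Put: function (key, value) { store[key] = value; }
  };
}

// Same logic as index_path_rewrite.js; renamed so it does not shadow
// Node's global `process`. In the gateway it must be called `process`.
function processRequest(context) {
  var originalPath = context.Get("_ctx.request.path");
  var matches = originalPath.match(/\/?(.*?)\/_search/);
  var indexNames = [];
  if (matches && matches.length > 1) {
    indexNames = matches[1].split(",");
  }
  var resultNames = [];
  var clusterNames = ["cluster01", "cluster02"];
  for (var i = 0; i < indexNames.length; i++) {
    if (indexNames[i].length > 0) {
      for (var j = 0; j < clusterNames.length; j++) {
        resultNames.push(clusterNames[j] + ":" + indexNames[i]);
      }
    }
  }
  if (resultNames.length > 0) {
    context.Put("_ctx.request.path", "/" + resultNames.join(",") + "/_search");
  }
}

var ctx = makeContext({ "_ctx.request.path": "/abc,efg/_search" });
processRequest(ctx);
console.log(ctx.Get("_ctx.request.path"));
// /cluster01:abc,cluster02:abc,cluster01:efg,cluster02:efg/_search
```

A path that does not contain /_search, such as /_cluster/health, does not match the regular expression and is left unchanged.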
Configuring the Gateway
Next, create a gateway configuration and use the javascript filter to call the script, as shown below:
entry:
  - name: my_es_entry
    enabled: true
    router: my_router
    max_concurrency: 10000
    network:
      binding: 0.0.0.0:8000
flow:
  - name: default_flow
    filter:
      - dump:
          context:
            - _ctx.request.path
      - javascript:
          file: index_path_rewrite.js
      - dump:
          context:
            - _ctx.request.path
      - elasticsearch:
          elasticsearch: dev
router:
  - name: my_router
    default_flow: default_flow
elasticsearch:
  - name: dev
    enabled: true
    schema: http
    hosts:
      - 192.168.3.188:9206
In the example above, we use a javascript filter and specify index_path_rewrite.js as the script file to load. We also use two dump filters to print the path before and after the script runs. Finally, an elasticsearch filter forwards the request to Elasticsearch for querying.
Starting the Gateway
Let’s start the gateway and test it:
➜ gateway ✗ ./bin/gateway
___ _ _____ __ __ __ _
/ _ \ /_\ /__ \/__\/ / /\ \ \/_\ /\_/\
/ /_\///_\\ / /\/_\ \ \/ \/ //_\\\_ _/
/ /_\\/ _ \/ / //__ \ /\ / _ \/ \
\____/\_/ \_/\/ \__/ \/ \/\_/ \_/\_/
[GATEWAY] A light-weight, powerful and high-performance elasticsearch gateway.
[GATEWAY] 1.0.0_SNAPSHOT, 2022-04-18 07:11:09, 2023-12-31 10:10:10, 8062c4bc6e57a3fefcce71c0628d2d4141e46953
[04-19 11:41:29] [INF] [app.go:174] initializing gateway.
[04-19 11:41:29] [INF] [app.go:175] using config: /Users/medcl/go/src/infini.sh/gateway/gateway.yml.
[04-19 11:41:29] [INF] [instance.go:72] workspace: /Users/medcl/go/src/infini.sh/gateway/data/gateway/nodes/c9bpg0ai4h931o4ngs3g
[04-19 11:41:29] [INF] [app.go:283] gateway is up and running now.
[04-19 11:41:30] [INF] [api.go:262] api listen at: http://0.0.0.0:2900
[04-19 11:41:30] [INF] [entry.go:312] entry [my_es_entry] listen at: http://0.0.0.0:8000
[04-19 11:41:30] [INF] [module.go:116] all modules are started
[04-19 11:41:30] [INF] [actions.go:349] elasticsearch [dev] is available
Performing the Test
Run the following query to verify the results:
curl localhost:8000/abc,efg/_search
You will see the debug information printed by the dump filters:
---- DUMPING CONTEXT ----
_ctx.request.path : /abc,efg/_search
---- DUMPING CONTEXT ----
_ctx.request.path : /cluster01:abc,cluster02:abc,cluster01:efg,cluster02:efg/_search
The query has been rewritten according to our requirement. Nice!
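The original question involved three clusters, while the script above hard-codes two. As a rough sketch, the expansion step can be pulled into a small helper that takes the cluster list as a parameter. The name expandIndexNames is hypothetical, and the choice to leave names that already carry a cluster: prefix untouched is an assumption about the desired behaviour, not something the original script does.

```javascript
// Hypothetical helper: expand a comma-separated index list across clusters.
function expandIndexNames(indexPart, clusterNames) {
  var result = [];
  var names = indexPart.split(",");
  for (var i = 0; i < names.length; i++) {
    var name = names[i];
    if (name.length === 0) {
      continue; // skip empty segments such as a trailing comma
    }
    if (name.indexOf(":") !== -1) {
      result.push(name); // assumption: already qualified, keep as-is
      continue;
    }
    for (var j = 0; j < clusterNames.length; j++) {
      result.push(clusterNames[j] + ":" + name);
    }
  }
  return result.join(",");
}

var clusters = ["cluster01", "cluster02", "cluster03"];
console.log(expandIndexNames("index1", clusters));
// cluster01:index1,cluster02:index1,cluster03:index1
```

Inside the gateway script, the returned string would take the place of resultNames.join(",") when building newPath.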
Rewriting DSL Query Statements
So far we have only rewritten the index names in the search path, but what about the DSL query itself? Can we rewrite that too? Of course!
Take a look at the following example:
function process(context) {
    var originalDSL = context.Get("_ctx.request.body");
    if (originalDSL.length > 0) {
        var jsonObj = JSON.parse(originalDSL);
        jsonObj.size = 123;
        jsonObj.aggs = {
            test1: {
                terms: {
                    field: "abc",
                    size: 10,
                },
            },
        };
        context.Put("_ctx.request.body", JSON.stringify(jsonObj));
    }
}
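First, we retrieve the search request body and parse it into a JSON object. Then we can freely modify the query object, serialize it, and write it back to the context.
Let’s test it, with the javascript filter pointing at this script and _ctx.request.body added to the context list of both dump filters: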
curl -XPOST localhost:8000/abc,efg/_search -d'{"query":{}}'
Output:
---- DUMPING CONTEXT ----
_ctx.request.path : /abc,efg/_search
_ctx.request.body : {"query":{}}
[04-19 18:14:24] [INF] [reverseproxy.go:255] elasticsearch [dev] hosts: [] => [192.168.3.188:9206]
---- DUMPING CONTEXT ----
_ctx.request.path : /abc,efg/_search
_ctx.request.body : {"query":{},"size":123,"aggs":{"test1":{"terms":{"field":"abc","size":10}}}}
Feel like you’ve unlocked a whole new world, right?
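The script above assumes the body is always valid JSON; JSON.parse throws otherwise. A more defensive variant might look like the sketch below. It is only an illustration: rewriteBody is a hypothetical helper, and passing an unparseable body through unchanged is an assumption about the desired behaviour. Keeping the logic in a pure function also makes it easy to exercise under Node.js without the gateway.

```javascript
// Hypothetical helper: returns the rewritten body, or the input unchanged
// when it is empty, not valid JSON, or not a JSON object.
function rewriteBody(rawBody) {
  if (!rawBody || rawBody.length === 0) {
    return rawBody;
  }
  var query;
  try {
    query = JSON.parse(rawBody);
  } catch (e) {
    return rawBody; // assumption: pass unparseable bodies through untouched
  }
  if (query === null || typeof query !== "object" || Array.isArray(query)) {
    return rawBody;
  }
  // Same modifications as the example script
  query.size = 123;
  query.aggs = { test1: { terms: { field: "abc", size: 10 } } };
  return JSON.stringify(query);
}

// In the gateway script this would be wired up roughly as:
//   function process(context) {
//     var body = context.Get("_ctx.request.body");
//     context.Put("_ctx.request.body", rewriteBody(body));
//   }
console.log(rewriteBody('{"query":{}}'));
// {"query":{},"size":123,"aggs":{"test1":{"terms":{"field":"abc","size":10}}}}
```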
Conclusion
The gateway’s javascript filter gives you a flexible way to meet complex business requirements. A custom script can read and modify the request context, including the request path and body, so you can implement whatever transformations and substitutions you need.