This repository contains the Docker image for CloudSuite’s Web Search benchmark.
The Web Search benchmark relies on the Apache Solr search engine framework. The benchmark includes a client machine that simulates real-world clients that send requests to the index nodes. The index nodes contain an index of the text and fields found in a set of crawled websites.
Supported tags and their respective Dockerfiles
server: This builds an image for the Apache Solr index nodes. You may spawn several nodes.
client: This builds an image with the client node. The client is used to start the benchmark and query the index nodes.
These images are automatically built using the mentioned Dockerfiles available on
To facilitate the communication between the client(s) and the server(s), we build a docker network:
$ docker network create search_network
We will attach the launched containers to this newly created docker network.
To start the server, you first have to pull the server image and then run it. To pull the server image, use the following command:
$ docker pull cloudsuite/web-search:server
The following command will start the server and forward port 8983 to the host, so that Apache Solr’s web interface can be accessed from a web browser using the host’s IP address. More information on Apache Solr’s web interface can be found here. The first parameter passed to the image indicates the memory allocated for the Java process. The pregenerated Solr index occupies 12GB of memory, and therefore we use 12g to avoid disk accesses. The second parameter indicates the number of Solr nodes. Because the index is for a single node only, this parameter should be 1.
$ docker run -it --name server --net search_network -p 8983:8983 cloudsuite/web-search:server 12g 1
At the end of the server booting process, the container prints the server_address of the index node. This address is used in the client container. The server_address message in the container should look like this (note that the IP address might change):
Index Node IP Address: 172.19.0.2
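Copying the address by hand is easy to get wrong, so the step above can be sketched as a small helper. This is a minimal sketch, assuming the server log contains the line in exactly the format shown above; `extract_server_address` is a hypothetical name, not part of the image.

```shell
# Hypothetical helper: read server log text on stdin and print the address
# from the first "Index Node IP Address: ..." line, if there is one.
extract_server_address() {
  sed -n 's/^.*Index Node IP Address: *\([0-9.][0-9.]*\).*$/\1/p' | head -n 1
}

# Demo with the sample line from this document.
printf '%s\n' 'Index Node IP Address: 172.19.0.2' | extract_server_address
```

With a running server container, the input could come from its log instead, for example `server_address=$(docker logs server 2>&1 | extract_server_address)`.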
To start a client, you first have to pull the client image and then run it. To pull the client image, use the following command:
$ docker pull cloudsuite/web-search:client
The following command will start the client node and run the benchmark. The server_address refers to the IP address (e.g., “172.19.0.2”) of the index node that receives the client requests. The four numbers after the server address refer to: the scale, which indicates the number of concurrent clients (50); the ramp-up time in seconds (90), which refers to the time required to warm up the server; the steady-state time in seconds (60), which indicates the time the benchmark is in the steady state; and the ramp-down time in seconds (60), which refers to the time to wait before ending the benchmark. Tune these parameters to stress your target system.
$ docker run -it --name client --net search_network cloudsuite/web-search:client server_address 50 90 60 60
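When tuning the four parameters, it can help to validate them and see the resulting command before launching a container. The following is a minimal sketch: `client_cmd` is a hypothetical helper that only prints the command, and the run-length estimate assumes the three phases run back to back.

```shell
# Hypothetical helper: validate the timing parameters and print the client
# command without running it. Defaults are the values used in this document.
client_cmd() {
  server_address=${1:-}
  scale=${2:-50}
  ramp_up=${3:-90}
  steady=${4:-60}
  ramp_down=${5:-60}
  if [ -z "$server_address" ]; then
    echo "usage: client_cmd server_address [scale ramp_up steady ramp_down]" >&2
    return 1
  fi
  # Reject anything that is not a plain non-negative integer.
  for n in "$scale" "$ramp_up" "$steady" "$ramp_down"; do
    case $n in
      ''|*[!0-9]*) echo "not a non-negative integer: $n" >&2; return 1 ;;
    esac
  done
  # Assumption: ramp-up, steady state, and ramp-down run one after another.
  echo "# expected run length: $((ramp_up + steady + ramp_down)) seconds"
  echo "docker run -it --name client --net search_network cloudsuite/web-search:client $server_address $scale $ramp_up $steady $ramp_down"
}

client_cmd 172.19.0.2 50 90 60 60
```

Printing rather than executing keeps the sketch safe to try on a machine without Docker; the printed line can be pasted into a shell once it looks right.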
The output results will show on the screen after the benchmark finishes.
The target response time requires that 99% of the requests are serviced within 200ms.
Alongside the throughput statistic (operations per second), the output reports response-time statistics in seconds, which are shown as:
<responseTimes unit="seconds">
    <operation name="GET" r90th="0.500">
        <avg>0.034</avg>
        <max>0.285</max>
        <sd>0.035</sd>
        <p90th>0.080</p90th>
        <passed>true</passed>
        <p99th>0.143</p99th>
    </operation>
</responseTimes>
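To compare a run against the 200ms target, the 99th-percentile value can be pulled out of the results and checked. This is a minimal sketch, assuming the output contains a `<p99th>` element in seconds as in the sample above; `check_p99` is a hypothetical helper, and the optional argument is the threshold in seconds.

```shell
# Hypothetical helper: read results XML on stdin and succeed only if the
# first <p99th> value (in seconds) is at or below the threshold.
check_p99() {
  threshold=${1:-0.200}   # 0.200 s = 200 ms target
  p99=$(sed -n 's/.*<p99th>\([0-9.]*\)<\/p99th>.*/\1/p' | head -n 1)
  if [ -z "$p99" ]; then
    echo "no p99th value found" >&2
    return 2
  fi
  # awk performs the floating-point comparison.
  awk -v v="$p99" -v t="$threshold" 'BEGIN { exit !(v <= t) }'
}

# Demo with the p99th value from the sample output.
printf '%s\n' '<p99th>0.143</p99th>' | check_p99 && echo "meets the 200ms target"
```

The helper returns 1 when the target is missed and 2 when no value is found, so it can be used directly in an `if` statement.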
This repository contains a 12GB index for a single node. The index was generated by crawling a set of websites with Apache Nutch. It’s possible to generate indexes for Apache Solr that are larger and that span multiple index nodes. More information on how to generate indexes can be found here.
The commands to add multiple index nodes are almost identical to the commands executed in the server image. An index has to be copied to Apache Solr’s core folder, and then the server is started. The only difference is that the new server nodes have to know the address and the port of the first index node. In our example, these are server_address and 8983. Note that the new servers also need to use a different port, for example 9983:
$ bin/solr start -cloud -p 9983 -z server_address:8983 -s /usr/src/solr_cores/ -m 12g
More information about Solr can be found here.