Web Search


This repository contains the Docker image for CloudSuite’s Web Search benchmark.

The Web Search benchmark relies on the Apache Solr search engine framework. The benchmark includes a client machine that simulates real-world clients sending requests to the index nodes. The index nodes contain an index of the text and fields found in a set of crawled websites.

Using the benchmark


Supported tags and their respective Dockerfile links:

These images are automatically built using the mentioned Dockerfiles available on https://github.com/parsa-epfl/cloudsuite/tree/master/benchmarks/web-search.

Creating a network between the server(s) and the client(s)

To allow the client(s) and the server(s) to communicate, we create a Docker network:

$ docker network create search_network

We will attach the launched containers to this newly created Docker network.
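When the benchmark is run repeatedly, creating the network a second time fails because the name is already taken. The following is a minimal sketch of a guard around the command above; the function name `ensure_network` is hypothetical and not part of the benchmark.

```shell
# Create the Docker network only if it does not exist yet.
# ensure_network is a hypothetical helper; "search_network" is the
# network name used throughout this README.
ensure_network() {
    net="${1:-search_network}"
    if docker network inspect "$net" >/dev/null 2>&1; then
        echo "network $net already exists"
    else
        docker network create "$net" >/dev/null && echo "network $net created"
    fi
}
```

Calling `ensure_network` with no argument checks for `search_network` and creates it if needed.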

Starting the server (Index Node)

To start the server you have to first pull the server image and then run it. To pull the server image, use the following command:

$ docker pull cloudsuite/web-search:server

The following command will start the server and forward port 8983 to the host, so that Apache Solr’s web interface can be accessed from a web browser using the host’s IP address. More information on the web interface can be found in the Apache Solr documentation. The first parameter passed to the image indicates the memory allocated to the Java process. The pregenerated Solr index occupies 12GB of memory, and therefore we use 12g to avoid disk accesses. The second parameter indicates the number of Solr nodes. Because the index is for a single node only, this parameter should always be 1.

$ docker run -it --name server --net search_network -p 8983:8983 cloudsuite/web-search:server 12g 1

At the end of the server booting process, the container prints the server_address of the index node. This address is used in the client container. The server_address message in the container should look like this (note that the IP address might change):

$ Index Node IP Address:
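If the boot message has scrolled out of view, the address can usually be read back from Docker itself. The sketch below assumes the container is named `server` and is attached to a single network, as in the command above; the function name `server_ip` is hypothetical.

```shell
# Print the IP address of a container, as reported by docker inspect.
# server_ip is a hypothetical helper. It assumes the container is
# attached to exactly one network (search_network in this README);
# with several networks the addresses would be concatenated.
server_ip() {
    container="${1:-server}"
    docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$container"
}
```

The printed value can then be used as the server_address when starting the client.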

Starting the client and running the benchmark

To start a client you have to first pull the client image and then run it. To pull the client image, use the following command:

$ docker pull cloudsuite/web-search:client

The following command will start the client node and run the benchmark. The server_address refers to the IP address, in brackets (e.g., “”), of the index node that receives the client requests. The four numbers after the server address refer to: the scale, which indicates the number of concurrent clients (50); the ramp-up time in seconds (90), which refers to the time required to warm up the server; the steady-state time in seconds (60), which indicates the time the benchmark is in the steady state; and the ramp-down time in seconds (60), which refers to the time to wait before ending the benchmark. Tune these parameters to stress your target system.

$ docker run -it --name client --net search_network cloudsuite/web-search:client server_address 50 90 60 60  
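Because the four numbers are positional, it is easy to swap them by accident. The sketch below builds the client command from named variables and rejects non-numeric values before anything is run. The function name `client_cmd` is hypothetical; the defaults are the values used in the command above. It only prints the command, it does not start a container.

```shell
# Print (not run) the client command for the given parameters.
# client_cmd is a hypothetical helper; defaults mirror this README.
client_cmd() {
    server_address="${1:-}"
    scale="${2:-50}"        # number of concurrent clients
    ramp_up="${3:-90}"      # warm-up time in seconds
    steady="${4:-60}"       # steady-state time in seconds
    ramp_down="${5:-60}"    # wait before ending, in seconds
    if [ -z "$server_address" ]; then
        echo "usage: client_cmd SERVER_ADDRESS [SCALE RAMP_UP STEADY RAMP_DOWN]" >&2
        return 1
    fi
    for n in "$scale" "$ramp_up" "$steady" "$ramp_down"; do
        case "$n" in
            ''|*[!0-9]*) echo "not a non-negative integer: $n" >&2; return 1 ;;
        esac
    done
    echo "docker run -it --name client --net search_network cloudsuite/web-search:client $server_address $scale $ramp_up $steady $ramp_down"
}
```

Running `client_cmd server_address 100 30 120 10` prints the command for 100 concurrent clients with a shorter ramp-up and a longer steady state, which can be reviewed before it is executed.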

The results are printed to the screen after the benchmark finishes.

Important remarks

	<metric unit="ops/sec">25.133</metric>
	<responseTimes unit="seconds">
   		<operation name="GET" r90th="0.500">
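To compare runs, it can help to pull the two numbers out of the output rather than reading them by eye. The sketch below assumes the client output has been saved to a file and that the relevant lines look like the fragment above; the real output may carry more attributes or operations. The function names `throughput` and `get_r90th` are hypothetical.

```shell
# Both helpers read benchmark output on stdin.
# They are hypothetical and only match lines shaped like the fragment above.

# Print the throughput in operations per second.
throughput() {
    sed -n 's/.*<metric unit="ops\/sec">\([0-9.]*\)<\/metric>.*/\1/p'
}

# Print the 90th-percentile response time (seconds) of the GET operation.
get_r90th() {
    sed -n 's/.*<operation name="GET"[^>]*r90th="\([0-9.]*\)".*/\1/p'
}
```

For example, with the output saved to a file named `results.xml` (an assumed name), `throughput < results.xml` prints the ops/sec value and `get_r90th < results.xml` prints the 90th-percentile GET latency.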

Additional Information

More information about Solr can be found in the Apache Solr documentation.