The server (on the master) starts a client on every slave via SSH. The client becomes a daemon, opens a TCP connection to the server and fetches the configuration, the worker and the tasks. It creates the command line and the environment (input files) for the worker and starts it. After the worker has finished, the client sends the result to the server and fetches a new task. The communication is done via HTTP; the server is therefore a webserver that can also be accessed with a web browser to get on-line information about the progress.
This guide is very brief. If you run into any trouble, you should probably check the Detailed description below.
The first time you start simparex it will create a configuration directory at /etc/simparex or ~/.simparex.
Go there and edit cluster.conf and server.conf. Usually you only have to change the default port in server.conf. The comments in the files will guide you. Further detail can be found below.
You should pick a directory where you want to start simparex. Let's assume it is ~/runs. Copy the sample worker directory from the tar-ball or from the installation directory to ~/runs/. If you haven't set things up differently, the following will do:
Edit worker.conf and read the comment in that file. See below for more information.
The testmode will try to check for some possible faults in the configuration.
Start the server with the following command and watch its output.
The results of all tasks will be in SampleWorkerDir/Results/Factors1 and SampleWorkerDir/Results/Factors2.
Simparex needs a server.conf and a cluster.conf. The search path for these files is as follows:
Usually you only have to change the default port in server.conf. If your hostname cannot be detected, you need to specify it as well.
Please adapt SlaveBaseDir, which is the directory on each slave to which the client and the worker are copied. (It is created if it does not exist.) Subdirectories for the individual tasks are created below it. The Slaves variable specifies all machines you want to run workers on. Hosts can be given either by hostname or by IP address. A host may be listed more than once; the client is then started that many times on it (reasonable for multiprocessor machines). Example:
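Assuming a simple key = value syntax (the comments in cluster.conf itself are authoritative), the two settings just discussed could be sketched like this; the path and the host names are placeholders. Listing 192.168.0.17 twice starts two clients on that machine:

```
SlaveBaseDir = /tmp/simparex
Slaves = node1 node2 192.168.0.17 192.168.0.17
```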
You should be able to log in to each of the machines via ssh with public-key authentication. Please check the ssh documentation for a detailed description. In short: generate a key pair with ssh-keygen -t rsa on your master machine. You can press enter at all questions (i.e. don't give a passphrase for your key). Then copy the public key to each slave machine and add it to the list of authorized keys. All together:
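The steps above, sketched for a single slave (the hostname slave1 is a placeholder; ssh-copy-id, where available, performs the copy-and-append in one step):

```shell
# On the master: generate a key pair, pressing enter at all questions
ssh-keygen -t rsa

# Copy the public key to the slave and append it to the authorized keys
ssh-copy-id slave1
# ...or manually:
cat ~/.ssh/id_rsa.pub | ssh slave1 'cat >> ~/.ssh/authorized_keys'

# Check: this should now log in without asking for a password
ssh slave1 true
```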
If you need different cluster configurations, you can create a copy of cluster.conf under a different name, say allmachines.conf. You can then start
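Since -c FILE selects a different cluster file (see the server options described further down), starting with the alternative configuration presumably looks like:

```shell
simparex -c allmachines.conf
```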
the search path for this file is as specified above (current directory, user cfg, system cfg).
We are now going to setup a working directory. This directory holds all information needed for a run, e.g. the worker configuration, the task description(s) and later the results.
You should pick a directory where you place the working directory (or directories) and from where you will start simparex. Let's assume it is ~/runs. Copy the sample worker directory from the tar-ball or from the installation directory to ~/runs/. If you haven't set things up differently, the following will do:
In this directory there is a little program (the worker) which calculates prime factors. We will use this example to illustrate the configuration. Rename the directory to anything that suits your purpose. For now we will name it Primefactors:
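Assuming the sample directory is named SampleWorkerDir and the tar-ball was unpacked to ~/simparex (both paths are placeholders), the copy and rename could look like:

```shell
cd ~/runs
cp -r ~/simparex/SampleWorkerDir .
mv SampleWorkerDir Primefactors
```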
This file must reside in the working directory. It specifies the worker program and some options for the run. Let's look at it in more detail:
First you can specify the file that contains the task descriptions; just leave it as it is for now. For every platform (currently just Linux) you can specify the name of the worker and the command line (Args). Here we use the program primefactors. To illustrate the input and output processing of simparex, this program reads one number from the command line and another one from standard input. The factors of the first number are printed to stdout and the factors of the second number are written to the file factors.txt. Let's see how to set this up. The Args variable is particularly important to understand: with it you construct the rest of the command line for the worker. You can use variables to refer to values in the task file (see tasks.csv below).
Additionally, you have to specify whether some input files have to be generated and which output files contain results you would like to collect. For large result files you should use the LargeResult specification instead.
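Putting the pieces of this section together, a worker.conf could be sketched roughly as follows. Only Args, LargeResult and Collect are names that actually appear in this guide; the key = value syntax, the $Number1 variable notation and the remaining key names are guesses made for illustration — the comments in the sample worker.conf are authoritative:

```
# Hypothetical sketch, not the real syntax -- see the sample worker.conf.
TaskFile = tasks.csv     # where to find the task descriptions (name guessed)
[Linux]                  # per-platform section (currently just Linux)
Worker  = primefactors   # name of the worker program
Args    = $Number1       # rest of the command line; $... refers to a task column (syntax guessed)
Input   = stdin.txt      # generated input file fed to the worker (name guessed)
Collect = factors.txt    # output file whose contents are collected as a result
```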
This file contains the values for the tasks. It is a simple ASCII file in which fields are separated with | (bar). Unfortunately, no comments are allowed in this file. The first line contains the header, which names the columns. These names can be used to refer to the values in worker.conf. Every following line specifies one task; there is no limit on the number of tasks. The following example defines 6 tasks. Each of them must assign a value to all fields (columns).
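A tasks.csv along those lines could be written as below. The column names Number1 and Number2 are placeholders; they only have to match the variables referenced in worker.conf:

```shell
# Write a hypothetical tasks.csv: a header line naming the columns,
# then one |-separated line per task.
cat > tasks.csv <<'EOF'
Number1|Number2
12|15
100|49
36|27
1024|81
210|121
17|999
EOF

# Fields are separated by "|"; e.g. list the first column of every task:
awk -F'|' 'NR > 1 { print $1 }' tasks.csv
# prints 12, 100, 36, 1024, 210, 17 (one per line)
```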
The testmode tries to check for some possible faults in the configuration.
You see which of your slaves are accessible and whether the configuration files are correct. This test procedure is not exhaustive and of course cannot ensure that everything works. Please take a look at the command line that is printed for the first task.
Usually you start the server with
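Presumably this is just the bare command, run from the directory that contains the working directory (a sketch; simparex -h lists the actual options):

```shell
cd ~/runs
simparex
```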
Please type simparex -h for the possible options. The server tells you which configuration files it uses and at which address you can access it with a web browser. For example
could look like:
There you have different levels of detail for observing the progress. Frequently used options are -p PORT and -c FILE, where PORT is the port to listen on and FILE specifies a different cluster file.
The results are collected using the Collect function from worker.conf. In this example the results of all successful tasks will go to Primefactors/Results/Factors1 and Primefactors/Results/Factors2. Old results are backed up.