Vector Performance Modelling – Vector Processor Architecture
Vector Performance Modelling:
Two parameters describe the performance of a vector processor:
- The asymptotic performance, or theoretical peak performance (r∞, that is, r with subscript ∞).
- The half-performance length (n₁/₂).
Theoretical peak performance is the maximum rate of computation the processor can achieve, expressed in FLOPS (floating-point operations per second). The parameter applies to a single vector processor as well as to a system of several vector processors. For example, a single Cray Y-MP processor has a 6 ns clock (about 167 MHz) and can deliver one add and one multiply result per cycle, giving an asymptotic performance of about 333 MFLOPS; an 8-processor Cray Y-MP system peaks at about 2.67 GFLOPS.
The half-performance length, as the name suggests, is the vector length at which the processor achieves half of its peak performance. The performance of a vector processor depends on the vector start-up time and the pipeline depth. As the start-up time and pipeline depth increase, longer vectors are needed to amortize the overhead, and peak performance becomes harder to attain. The value n₁/₂ therefore indicates how long a vector must be for the processor to reach at least half of its peak rate; a smaller n₁/₂ means that short vectors are handled more efficiently.
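The text above does not give a formula linking the two parameters. A common way to relate them is Hockney's timing model, in which an operation on a vector of length n takes t(n) = (n + n₁/₂) / r∞, so the sustained rate is r(n) = r∞ · n / (n + n₁/₂). The following is a minimal Python sketch under that assumption. The function names and the sample values (r∞ = 100 MFLOPS, n₁/₂ = 20) are illustrative and are not measurements of any real machine.

```python
def hockney_time(n, r_inf, n_half):
    """Time for one vector operation of length n (assumed Hockney model).

    With r_inf in MFLOPS the result is in microseconds.
    """
    if n <= 0:
        raise ValueError("vector length must be positive")
    return (n + n_half) / r_inf


def hockney_rate(n, r_inf, n_half):
    """Sustained rate for vector length n, in the same unit as r_inf."""
    if n <= 0:
        raise ValueError("vector length must be positive")
    return r_inf * n / (n + n_half)


if __name__ == "__main__":
    R_INF = 100.0   # hypothetical peak rate in MFLOPS
    N_HALF = 20     # hypothetical half-performance length
    for n in (1, 20, 200, 2000):
        rate = hockney_rate(n, R_INF, N_HALF)
        print(f"n = {n:5d}  rate = {rate:6.1f} MFLOPS")
```

At n = n₁/₂ the sketch returns exactly half of r∞, and the rate approaches r∞ only as n grows much larger than n₁/₂, which matches the definitions given above.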
Besides these parameters, the basic performance measure for any multiprocessor system is the speed-up factor. It is the ratio of the execution time on one processor to the execution time on P processors. Equivalently, it is the ratio of the speed of P processors working simultaneously to the speed of a single processor.
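The ratio just described can be sketched in Python as follows. The function name and the timing values are hypothetical and chosen only to illustrate the arithmetic; they are not measurements.

```python
def speedup(t_single, t_parallel):
    """Speed-up factor: single-processor time divided by P-processor time."""
    if t_single <= 0 or t_parallel <= 0:
        raise ValueError("execution times must be positive")
    return t_single / t_parallel


if __name__ == "__main__":
    T_SINGLE = 120.0    # hypothetical time on one processor, in seconds
    T_PARALLEL = 20.0   # hypothetical time on P = 8 processors, in seconds
    print(f"speed-up = {speedup(T_SINGLE, T_PARALLEL):.1f}")
```

With these made-up timings, eight processors give a speed-up of 6 rather than 8, because the measured parallel time already contains whatever overhead the parallel run incurred.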
The strength of this measure is that it is based on execution time, so all the overhead of the parallel system is already taken into account. An important point is that the same program should not be used for both the single-processor and the parallel measurement, because the algorithm best suited to a single processor generally differs from the one suited to parallel processors.
Also, when comparing execution times on single and parallel processors, the sequential time used must be that of the best available sequential algorithm. Hence the speed-up ratio can be given as: