Posted on October 26, 2010 at 9:29 AM
The last BSC call granted us 70,000 h of computing time to run our code. We will therefore be able to test the efficiency of its parallelization and also obtain some results for four-electron harmonium. An unfortunate event (the workstation in Szczecin halted) allowed me to test the performance on 16 cores. The results are very promising for the symmetry-related part of the code, which reaches an efficiency of 89% on 8 cores. Some magic numbers of processors yield the best performance. They are the divisors of 64: 2, 4, 8 and 16. No surprise here: an even distribution of tasks maximizes performance. This follows from the way the tasks are distributed among the processors; there is a nice linear trend relating the maximum number of tasks (out of 64) assigned to a core to the computational time.
Maximum number of tasks assigned to each processor vs. time (s).
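To see why the divisors of 64 stand out, here is a minimal sketch of the load-balance argument. It assumes that the 64 tasks cost the same and are handed out statically, so the run time is set by the busiest processor; the function names are mine and are not taken from our code.

```python
from math import ceil

N_TASKS = 64  # number of tasks quoted in the post


def max_tasks_per_proc(n_procs, n_tasks=N_TASKS):
    """Largest number of tasks any single processor receives (assumes a static, even split)."""
    return ceil(n_tasks / n_procs)


def ideal_efficiency(n_procs, n_tasks=N_TASKS):
    """Upper bound on efficiency if all tasks cost the same (an assumption).

    The busiest processor sets the wall time, so the bound is
    n_tasks / (n_procs * max_tasks_per_proc).
    """
    return n_tasks / (n_procs * max_tasks_per_proc(n_procs, n_tasks))


if __name__ == "__main__":
    for p in range(1, 17):
        print(p, max_tasks_per_proc(p), round(ideal_efficiency(p), 3))
```

Under these assumptions only processor counts that divide 64 reach a bound of 1; any other count leaves some processors idle while the busiest one finishes its extra task. The measured efficiencies will of course sit below this bound.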
The only warning sign is the 16-processor job, which 'only' reaches an efficiency of 76%. I am not too worried about that, because with 16 processors we saturate the machine (so the job competes with system processes). In addition, the task split over 16 processors takes only 15 minutes. In fact, the same parallelized routine is called 10 times during those 15 minutes, so the performance is measured on a 1.5-minute run and any delay accumulates ten times. That is not long enough to obtain reliable data.
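As a rough illustration of how sensitive such a short run is, the sketch below uses the usual definition of parallel efficiency and then assumes, purely for the sake of argument, that the whole loss comes from a fixed delay paid on each of the 10 calls. The function names are hypothetical, and the inputs are only the rounded figures quoted above.

```python
def parallel_efficiency(t_serial, t_parallel, n_procs):
    """Standard definition: serial time over (processors * parallel time)."""
    return t_serial / (n_procs * t_parallel)


def implied_delay_per_call(total_seconds, n_calls, efficiency):
    """Delay per call if ALL the efficiency loss were a fixed per-call delay.

    This is an illustrative assumption, not a measurement.
    """
    per_call = total_seconds / n_calls
    return per_call * (1.0 - efficiency)


if __name__ == "__main__":
    # 15 minutes in total, 10 calls, 76% efficiency (figures from the post)
    delay = implied_delay_per_call(15 * 60, 10, 0.76)
    print(f"implied delay per call: {delay:.1f} s")
```

Under that assumption, a delay of roughly twenty seconds per 90-second call would be enough to explain the drop, which is why a longer run is needed before drawing conclusions.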