The project's goal is to study the behavior and performance of a use case in which a web application performs a moderately to heavily intensive task (CPU- and/or IO-bound). This computational task is embedded in an autonomous component that can be called a microservice. For this study, the computational task is text extraction from a PDF file.
During the measurements, three variables were varied: the number of workers, the server chunk size, and the client chunk size. A chunk size that is too small heavily degrades performance. The server uses the same chunk size for all streaming connections, which is why it influences the results more than the client's chunk size does. It would be possible to differentiate the two (one size for the streams to the client, another for the streams to the workers), but as the experiment shows, simply fixing a single, sufficiently large value is enough to obtain good performance.
The minimum time for the first setup is 2.45077 seconds, and the minimum achieved with one worker is 2.92923 seconds. The difference is not critical when there is only one worker. Adding a second worker reduces the file-processing time only when the chunk size is very small. For all setups with at least two workers, the minimum time obtained is 3.9174 seconds, which is already a significant performance penalty. The reason is a mutex used to guarantee atomic modification of the variable the server uses to keep track of the last worker used. Additional workers beyond the second do not change the performance at all, but they do make the system more fault-tolerant.
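The serialization point described above can be sketched as follows (a hypothetical reconstruction, assuming mutex-guarded round-robin dispatch; the names `pool` and `nextWorker` are illustrative and do not come from the project's code). Every request must briefly acquire the lock before being dispatched, so this critical section caps the speedup obtainable from extra workers:

```go
package main

import (
	"fmt"
	"sync"
)

// pool sketches the server-side bookkeeping: a mutex guards the
// index of the last worker used, so concurrent requests serialize
// on nextWorker before being dispatched to a worker.
type pool struct {
	mu      sync.Mutex
	last    int
	workers int
}

// nextWorker atomically advances the round-robin index. This
// critical section is the serialization point that limits the
// benefit of adding workers beyond the second.
func (p *pool) nextWorker() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.last = (p.last + 1) % p.workers
	return p.last
}

func main() {
	p := &pool{workers: 3}
	counts := make([]int, p.workers)
	var wg sync.WaitGroup
	var cmu sync.Mutex
	for i := 0; i < 300; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			w := p.nextWorker()
			cmu.Lock()
			counts[w]++
			cmu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println(counts) // load spreads evenly: [100 100 100]
}
```

The lock keeps the counter update atomic and the load evenly spread, at the cost of forcing all requests through one critical section; a lock-free alternative such as `sync/atomic.AddInt64` on the counter would reduce that contention.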