
Tesseract: High CPU Usage and slow speed, only when running via Supervisord [Resolved]

Problem

pytesseract.image_to_string() takes far too long when I run the script through supervisord, but executes almost instantaneously when run directly in a shell (on the same server, at the same time as the supervisord-managed scripts).
Besides being slow, the processes also show high CPU usage.

Time taken by pytesseract.image_to_string() when run via Supervisord: ~30s
Time taken by pytesseract.image_to_string() when run via Bash: 0.1s
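Timings like the two above can be collected by wrapping the call in a small timer. This is only a minimal sketch: `timed` is a hypothetical helper, not part of pytesseract, and the question does not show how its numbers were actually measured.

```python
import time


def timed(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Intended use (assumes pytesseract and Pillow are installed and that
# "page.png" is a placeholder for a real image):
#   text, seconds = timed(pytesseract.image_to_string, Image.open("page.png"))
#   print(f"image_to_string took {seconds:.2f}s")
```

Logging the elapsed time from each instance makes it easier to compare the supervisord and shell runs side by side.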

This problem only occurs when many processes (around 22 instances) are calling pytesseract.image_to_string() under supervisord. If I reduce the number of instances to around 10, the scripts run via supervisord are also fast.

System Information

OS: Ubuntu 18.04.2 LTS (bionic)
Supervisord: Version 3.3.1
Tesseract: Version 4.0.0-beta.1
Python: Version 3.6
PyTesseract: Version 0.2.5

ulimit -a

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 127357
max locked memory       (kbytes, -l) 16384
max memory size         (kbytes, -m) unlimited
open files                      (-n) 8096
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 127357
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Let me know if you need any more information.

Edit 1 (or I know what's NOT the source of this problem)

I am fairly certain that it is not an issue with Supervisord.

When I run one instance from an ssh shell while 10 instances are running via Supervisord, the function (pytesseract.image_to_string()) executes smoothly (i.e., it takes only 0.1s).
When I start another instance from a new ssh shell, both instances started from ssh run smoothly most of the time.
When I start yet another instance from a new ssh shell, all three instances start choking and take around 10s to execute the function. This time keeps increasing as I add more instances via the shell.

So the problem can be replicated even with a shell.
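The manual experiment above (adding instances one at a time and watching the timings) could be scripted roughly as follows. This is a sketch under assumptions: `time_command` and `run_concurrently` are hypothetical helpers, and the placeholder command should be replaced by whatever actually invokes the OCR step.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def time_command(cmd):
    """Run cmd to completion and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def run_concurrently(cmd, instances):
    """Start `instances` copies of cmd at once; return one duration per copy."""
    with ThreadPoolExecutor(max_workers=instances) as pool:
        return list(pool.map(time_command, [cmd] * instances))


if __name__ == "__main__":
    # Placeholder command; swap in the real workload, for example
    # ["tesseract", "page.png", "stdout"] with an actual image path.
    cmd = [sys.executable, "-c", "pass"]
    for n in (1, 2, 3):
        durations = run_concurrently(cmd, n)
        print(n, [f"{d:.2f}s" for d in durations])
```

If the per-instance time jumps sharply once the number of copies passes some threshold, that points at contention between the processes rather than at the process manager.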

More Information

I ran the program with strace -T -f but I could not figure out what exactly is causing the spike in time.

For a function call that takes 1s

Top 10 system calls sorted by time taken
1.504530    [pid 29921] <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 30166
0.503915    [pid 29932] <... select resumed> )      = 0 (Timeout)
0.503472    [pid 29932] <... select resumed> )      = 0 (Timeout)
0.500524    [pid 29933] <... select resumed> )      = 0 (Timeout)
0.500515    [pid 29933] <... select resumed> )      = 0 (Timeout)
0.500514    [pid 29932] <... select resumed> )      = 0 (Timeout)
0.500512    [pid 29933] <... select resumed> )      = 0 (Timeout)
0.069869    [pid 30169] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
0.035989    [pid 30167] <... futex resumed> )       = 0
0.016002    [pid 30168] <... futex resumed> )       = 0

For a function call that takes 9s

Top 10 system calls sorted by time taken
9.795787    [pid 29921] <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 30106
0.515960    [pid 29933] <... select resumed> )      = 0 (Timeout)
0.511955    [pid 29933] <... select resumed> )      = 0 (Timeout)
0.507979    [pid 29932] <... select resumed> )      = 0 (Timeout)
0.507968    [pid 29932] <... select resumed> )      = 0 (Timeout)
0.505257    [pid 29932] <... select resumed> )      = 0 (Timeout)
0.503988    [pid 29932] <... select resumed> )      = 0 (Timeout)
0.503978    [pid 29932] <... select resumed> )      = 0 (Timeout)
0.503975    [pid 29932] <... select resumed> )      = 0 (Timeout)
0.503974    [pid 29932] <... select resumed> )      = 0 (Timeout)
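In both listings the slowest entry is wait4, which suggests that the Python process spends the time waiting for a child process (pytesseract runs the tesseract binary as a subprocess) rather than in Python code. Listings like these can be produced from raw strace -T output with a short script. The sketch below assumes that each line ends with the elapsed time in angle brackets, as strace -T prints it; `slowest_calls` is a hypothetical helper, and the sample lines are rebuilt from the first listing above with the time moved back to the end of the line.

```python
import re

# strace -T appends the time spent in each system call as "<seconds>"
# at the end of the line.
_ELAPSED = re.compile(r"<(\d+\.\d+)>\s*$")


def slowest_calls(lines, top=10):
    """Return up to `top` (seconds, call_text) pairs, slowest first."""
    timed_calls = []
    for line in lines:
        match = _ELAPSED.search(line)
        if match:
            call_text = line[:match.start()].rstrip()
            timed_calls.append((float(match.group(1)), call_text))
    timed_calls.sort(key=lambda item: item[0], reverse=True)
    return timed_calls[:top]


sample = [
    "[pid 29932] <... select resumed> )      = 0 (Timeout) <0.503915>",
    "[pid 29921] <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], "
    "0, NULL) = 30166 <1.504530>",
    "[pid 30167] <... futex resumed> )       = 0 <0.035989>",
]

for seconds, call in slowest_calls(sample):
    print(f"{seconds:.6f}    {call}")
```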

Question Credit: rGun
Question Reference
Asked July 21, 2019
Tags: tesseract
Posted Under: Unix Linux
24 views
1 Answer

Disabling multithreading in Tesseract fixed the issue. This can be done by setting OMP_THREAD_LIMIT=1 in the environment, which limits each Tesseract process to a single OpenMP thread.

See https://github.com/tesseract-ocr/tesseract/issues/898#issuecomment-315202167
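From Python, one way to apply this is to set the variable in the script's own environment before the OCR call, since child processes inherit it. The following is a minimal sketch, not the answer's exact setup: `limit_openmp_threads` and `child_env_value` are hypothetical helpers, and the second one exists only to show that a subprocess really sees the value.

```python
import os
import subprocess
import sys


def limit_openmp_threads(limit=1):
    """Set OMP_THREAD_LIMIT for this process and every child it starts later."""
    os.environ["OMP_THREAD_LIMIT"] = str(limit)


def child_env_value(name):
    """Start a child Python process and return what it sees for env var `name`."""
    code = f"import os; print(os.environ.get({name!r}, ''))"
    completed = subprocess.run([sys.executable, "-c", code],
                               stdout=subprocess.PIPE, check=True)
    return completed.stdout.decode().strip()


limit_openmp_threads(1)
print(child_env_value("OMP_THREAD_LIMIT"))

# After this point a call such as
#   pytesseract.image_to_string(Image.open("page.png"))  # placeholder path
# starts the tesseract binary with the limit already in its environment.
```

When the scripts run under supervisord, the same variable can presumably be set through the `environment` option of the relevant `[program:...]` section instead of in the script itself.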


credit: rGun
Answered July 21, 2019