For a side project I am working on, I have a problem with long-running tasks that block requests inside the HTTP server for 1 to 30 seconds, depending on the task. I decided to use Flask for this project since it has a nice blueprint feature, and each blueprint could be turned into a microservice when necessary. To scale my HTTP servers and distribute the workload among other machines, I need a broker, publish/subscribe, or some sort of queue. Since my whole backend stack is in Python for rapid development, I decided to use Celery with Redis. Redis, because I can also use it as a session store and as a cache layer, so I don't need to introduce another element or language, as I would with, e.g., RabbitMQ.

I also decided to store my task results in PostgreSQL, since I like an SQL database for storing important data and I can use one of the nice Flask plugins, Flask-SQLAlchemy. I can also later move my database to a cloud solution like the great CockroachDB, created by former Google employees, without sacrificing code, since it works with the Postgres driver. Since I have written about almost the whole stack but one part, I want to mention the last piece: HAProxy, which will be my load balancer of choice. I will draw the whole thing in UMLet, a great open-source UML plugin/standalone tool. But please forgive me, UML champions, for my inconvenient diagram.

OK, so now to Celery itself, since it's healthy to eat it. First, I want all my workers to be classes so I can define some internal methods if I want to. Obviously, I am not building the distributed calculator from the examples but an actual application (I hope so). So instead of this:
my actual task looks like this:
and it lives in a file called test.py inside a package.
My Celery task runner, runner.py, on the other hand, apart from checking whether the results database exists, looks like this:
Let me also paste here the details from my config file, named conf.
With that in place, I can start my worker from the command line using:
celery --app=runner worker --loglevel=info
This produces nice log output, where I can check that my task shows up.
If I wanted to use a database other than SQLite, I would need to check that the connection works and that the database exists. See the documentation for details.
OK, so now I can run my script, located in the test directory, from the command line:
and get the response:
In the Celery console, on the other hand, I got:
I can also see my persisted result inside result.db using the great open-source sqlitebrowser. There should be one entry in the celery_taskmeta table, and if we double-click the result field, there should be hello siema inside that blob.
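The same check can be scripted instead of clicking through sqlitebrowser. As far as I know, Celery's database backend stores the result column as a pickled blob, so it has to be unpickled after reading. The sketch below builds a throwaway in-memory table with a simplified, assumed subset of the celery_taskmeta columns and a made-up task id, so that it runs on its own. Against the real file you would connect to result.db and skip the setup part.

```python
import pickle
import sqlite3


def read_results(conn):
    """Return (task_id, status, unpickled result) for each stored task."""
    rows = conn.execute("SELECT task_id, status, result FROM celery_taskmeta")
    # Only unpickle data you trust: pickle.loads can execute arbitrary code.
    return [(task_id, status, pickle.loads(blob)) for task_id, status, blob in rows]


# Stand-in for sqlite3.connect("result.db"); the schema is a simplified assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE celery_taskmeta ("
    "id INTEGER PRIMARY KEY, task_id TEXT, status TEXT, result BLOB)"
)
conn.execute(
    "INSERT INTO celery_taskmeta (task_id, status, result) VALUES (?, ?, ?)",
    ("example-task-id", "SUCCESS", pickle.dumps("hello siema")),
)

if __name__ == "__main__":
    for row in read_results(conn):
        print(row)
```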
So that's it. I can now spread my workers across multiple files and use them as I want to. I can also run two workers in two terminals to see which one picks up a task and how the tasks are distributed.
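To get a feeling for how tasks end up spread across two workers without opening two terminals, the competing-consumers idea can be imitated with plain threads. This is a stand-in, not Celery: a queue.Queue plays the role of the Redis broker, two threads play the workers (the names worker1 and worker2 are made up), and which worker picks up which task varies from run to run.

```python
import queue
import threading


def worker(name, tasks, handled):
    """Pull items until a None sentinel arrives, recording who handled what."""
    while True:
        item = tasks.get()
        if item is None:
            break
        handled.append((name, f"hello {item}"))


def run_demo(items, worker_names=("worker1", "worker2")):
    tasks = queue.Queue()
    handled = []  # list.append is safe to call from several threads in CPython
    threads = [
        threading.Thread(target=worker, args=(name, tasks, handled))
        for name in worker_names
    ]
    for t in threads:
        t.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(None)  # one stop sentinel per worker
    for t in threads:
        t.join()
    return handled


if __name__ == "__main__":
    for name, result in run_demo(["siema", "a", "b", "c"]):
        print(name, result)
```

Every item is handled exactly once, but nothing guarantees an even split between the two workers, which is the same thing worth watching for in the two Celery terminals.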
All the code from this Celery example is available on GitHub here.