Running "unique" tasks with celery


I use celery to update RSS feeds in my news aggregation site. I use one @task for each feed, and things seem to work nicely.

There's one detail I'm not sure I handle well, though: all feeds are updated once every minute with a @periodic_task, but what if a feed is still updating from the previous periodic run when a new one starts? (For example, if the feed is really slow, or offline with the task held in a retry loop.)

Currently I store the task results and check their status like this:

import socket
from datetime import timedelta
from celery.decorators import task, periodic_task
from aggregator.models import Feed

_results = {}

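@periodic_task(run_every=timedelta(minutes=1))
def fetch_articles():
    for feed in Feed.objects.all():
        # Assumption: results are keyed by the feed's primary key
        # (the key expression is missing from the snippet as posted).
        if feed.pk in _results:
            if not _results[feed.pk].ready():
                # The task is not finished yet
                continue
        _results[feed.pk] = update_feed.delay(feed)

@task()
def update_feed(feed):
    try:
        # Assumption: the fetch call is missing from the snippet as posted;
        # fetch_articles() is a hypothetical method on Feed.
        feed.fetch_articles()
    except socket.error as exc:
        update_feed.retry(args=[feed], exc=exc)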

Maybe there is a more sophisticated/robust way of achieving the same result using some Celery mechanism that I missed?

10/16/2017 6:24:18 PM

Accepted Answer

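From the official documentation: "Ensuring a task is only executed one at a time". The recipe there guards the task body with a lock built on the cache's atomic add operation, so a second run skips the work while the first one is still in progress.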
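The idea behind that recipe can be sketched without Celery or a real cache server. This is only an illustration under stated assumptions: InMemoryCache is a hypothetical single-process stand-in that mimics the add/delete behaviour of a shared cache, and update_feed_once, LOCK_EXPIRE and the lock id format are made-up names, not part of Celery or of the documentation's code.

```python
import threading
import time


class InMemoryCache:
    """Hypothetical stand-in for a shared cache backend such as memcached."""

    def __init__(self):
        self._data = {}
        self._guard = threading.Lock()

    def add(self, key, value, timeout):
        # Store the value only if the key is absent or expired,
        # and report whether it was stored.
        with self._guard:
            entry = self._data.get(key)
            if entry is not None and entry[1] > time.monotonic():
                return False
            self._data[key] = (value, time.monotonic() + timeout)
            return True

    def delete(self, key):
        with self._guard:
            self._data.pop(key, None)


cache = InMemoryCache()
LOCK_EXPIRE = 60 * 5  # assumed value: longer than the slowest expected update


def update_feed_once(feed_id, do_update):
    """Run do_update(feed_id) unless an update of this feed already holds the lock."""
    lock_id = "update-feed-lock-%s" % feed_id
    if not cache.add(lock_id, "true", LOCK_EXPIRE):
        return False  # another update of this feed is still running
    try:
        do_update(feed_id)
    finally:
        # Always release, so a failed update does not block the next run
        cache.delete(lock_id)
    return True
```

The lock expiry is a safety net: if a worker dies before releasing the lock, the key eventually times out and the feed can be updated again. In a real deployment the cache would have to be shared between all workers for the lock to mean anything.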

10/31/2013 4:28:41 PM

Based on MattH's answer, you could use a decorator like this:

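import functools

from django.core.cache import cache  # assumption: Django's cache framework

def single_instance_task(timeout):
    def task_exc(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            lock_id = "celery-single-instance-" + func.__name__
            # cache.add only stores the key if it is absent, so it acts as a lock
            acquire_lock = lambda: cache.add(lock_id, "true", timeout)
            release_lock = lambda: cache.delete(lock_id)
            if acquire_lock():
                try:
                    func(*args, **kwargs)
                finally:
                    release_lock()
        return wrapper
    return task_exc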

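Then use it like so:

@periodic_task(run_every=timedelta(minutes=1))
@single_instance_task(60 * 10)  # lock timeout in seconds; the value is only illustrative
def fetch_articles():
    pass  # yada yada...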
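Note that the lock id above is built from the function name only, so every call of the decorated function shares one lock. For the per-feed case in the question, one possible variant also puts the positional arguments into the lock id. The sketch below is an assumption-laden illustration, not part of the answer: single_instance_per_args and DictCache are hypothetical names, DictCache is a single-process stand-in for a shared cache, and the arguments are assumed to be simple values such as primary keys.

```python
import functools
import time


class DictCache:
    """Hypothetical single-process stand-in exposing add/delete like a shared cache."""

    def __init__(self):
        self._expiry = {}

    def add(self, key, value, timeout):
        now = time.monotonic()
        if self._expiry.get(key, 0) > now:
            return False  # key still present: the lock is held
        self._expiry[key] = now + timeout
        return True

    def delete(self, key):
        self._expiry.pop(key, None)


cache = DictCache()


def single_instance_per_args(timeout):
    """Like single_instance_task, but the lock id also includes the positional args."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            lock_id = "celery-single-instance-%s-%s" % (
                func.__name__, "-".join(str(a) for a in args))
            if not cache.add(lock_id, "true", timeout):
                return None  # the same call is already in progress
            try:
                return func(*args, **kwargs)
            finally:
                cache.delete(lock_id)
        return wrapper
    return decorator


@single_instance_per_args(60 * 10)
def update_feed(feed_id):
    # Placeholder body; a real task would fetch the feed here
    return "updated %s" % feed_id
```

With this variant, a slow update of one feed only blocks further updates of that same feed, while other feeds keep their own locks.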
