Most elegant approach for writing JSON data to a relational database using Django Models?


Question

I have a typical Relational Database model laid out in Django where a typical model contains some ForeignKeys, some ManyToManyFields, and some fields that extend Django's DateTimeField.

I want to save data that I am receiving in JSON format (not flat) from an external api. I wan't it such that data gets saved to respective tables (not the whole json string to one field). What is the most clean and simple approach to do this? Is there a library available to make this task simpler?

Here's an example to clarify my question,

Models-

class NinjaData(models.Model):
    id = models.IntegerField(primary_key=True, unique=True)
    name = models.CharField(max_length=60)  
    birthdatetime = MyDateTimeField(null=True)
    deathdatetime = MyDatetimeField(null=True)
    skills = models.ManyToManyField(Skills, null=True)
    weapons = models.ManyToManyField(Weapons, null=True)
    master = models.ForeignKey(Master, null=True)

class Skills(models.Model):
    id = models.IntegerField(primary_key=True, unique=True)
    name = models.CharField(max_length=60)
    difficulty = models.IntegerField(null=True)

class Weapons(models.Model):
    id = models.IntegerField(primary_key=True, unique=True)
    name = models.CharField(max_length=60)
    weight = models.FloatField(null=True)

class Master(models.Model):
    id = models.IntegerField(primary_key=True, unique=True)
    name = models.CharField(max_length=60)
    is_awesome = models.NullBooleanField()

now, I typically have to save json string data that I obtain from an external api (secret ninja api) into this model, json looks like this

JSON-

{
"id":"1234",
"name":"Hitori",
"birthdatetime":"11/05/1999 20:30:00",
"skills":[
    {
    "id":"3456",
    "name":"stealth",
    "difficulty":"2"
    },
    {
    "id":"678",
    "name":"karate",
    "difficulty":"1"
    }
],
"weapons":[
    {
    "id":"878",
    "name":"shuriken",
    "weight":"0.2"
    },
    {
    "id":"574",
    "name":"katana",
    "weight":"0.5"
    }
],
"master":{
    "id":"4",
    "name":"Schi fu",
    "is_awesome":"true"
    }
}

now logic for handling a typical ManyToManyField is fairly simple,

logic code -

data = json.loads(ninja_json)
ninja = NinjaData.objects.create(id=data['id'], name=data['name'])

if 'weapons' in data:
    weapons = data['weapons']
    for weapon in weapons:
        w = Weapons.objects.get_or_create(**weapon)  # create a new weapon in Weapon table
        ninja.weapons.add(w)

if 'skills' in data:
    ...
    (skipping rest of the code for brevity)

There are many approaches that i could use,

  • code above logic in the view function that does all the work of converting json to model instances
  • code above logic overriding model's __init__ method
  • code above logic overriding model's save() method
  • create a Manager for each model and code this logic inside each of its methods like create, get_or_create, filter etc.
  • extend ManyToManyField and put it there,
  • an external library?

I would like to know if there is a single most obvious way to save data in this json form to database without coding the above logic multiple times, what would be the most elegant approach that you would suggest?

Thanks all for reading the long post,

1
17
12/4/2011 2:49:11 PM

Accepted Answer

In my opinion the cleanest place for the code you need is as a new Manager method (eg from_json_string) on a custom manager for the NinjaData model.

I don't think you should override the standard create, get_or_create etc methods since you're doing something a bit different from what they normally do and it's good to keep them working normally.

Update: I realised I'd probably want this for myself at some point so I have coded up and lightly tested a generic function. Since it recursively goes through and affects other models I'm no longer certain it belongs as a Manager method and should probably be a stand-alone helper function.

def create_or_update_and_get(model_class, data):
    get_or_create_kwargs = {
        model_class._meta.pk.name: data.pop(model_class._meta.pk.name)
    }
    try:
        # get
        instance = model_class.objects.get(**get_or_create_kwargs)
    except model_class.DoesNotExist:
        # create
        instance = model_class(**get_or_create_kwargs)
    # update (or finish creating)
    for key,value in data.items():
        field = model_class._meta.get_field(key)
        if not field:
            continue
        if isinstance(field, models.ManyToManyField):
            # can't add m2m until parent is saved
            continue
        elif isinstance(field, models.ForeignKey) and hasattr(value, 'items'):
            rel_instance = create_or_update_and_get(field.rel.to, value)
            setattr(instance, key, rel_instance)
        else:
            setattr(instance, key, value)
    instance.save()
    # now add the m2m relations
    for field in model_class._meta.many_to_many:
        if field.name in data and hasattr(data[field.name], 'append'):
            for obj in data[field.name]:
                rel_instance = create_or_update_and_get(field.rel.to, obj)
                getattr(instance, field.name).add(rel_instance)
    return instance

# for example:
from django.utils.simplejson import simplejson as json

data = json.loads(ninja_json)
ninja = create_or_update_and_get(NinjaData, data)
10
12/5/2011 1:14:40 PM

I don't know if you're familiar with the terminology, but what you're basically trying to do is de-serialize from a serialized/string format (in this case, JSON) into Python model objects.

I'm not familiar with Python libraries for doing this with JSON, so I can't recommend/endorse any, but a search using terms like "python", "deserialization", "json", "object", and "graph" seems to reveal some Django documentation for serialization and the library jsonpickle on github.


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Icon