Django's update_or_create failing although kwargs is specified

April 4, 2024

I have the following Django model:

class Itens(models.Model):
    id = models.AutoField(primary_key=True)
    itempr_id = models.IntegerField(unique=True) # This is NOT a relationship
    cod_item = models.CharField(max_length=16, unique=True)
    # Other fields...

As you can see, both itempr_id and cod_item are unique, so I wrote the update_or_create call to take this into account:

class CreateItensSerializer(serializers.ModelSerializer):
    class Meta:
        model = Itens
        fields = '__all__'

    def create(self, validated_data):
        item, created = Itens.objects.update_or_create(
            itempr_id=validated_data.get('itempr_id'),
            cod_item=validated_data.get('cod_item'),
            defaults=validated_data
        )

        return item

I call and save the serializer like this:

serializer = CreateItensSerializer(data=itens, many=True)
if serializer.is_valid():
    serializer.save()

However, whenever there is a duplicate, I get the following error:

{'itempr_id': [ErrorDetail(string='itens with this itempr id already exists.', code='unique')], 'cod_item': [ErrorDetail(string='itens with this cod item already exists.', code='unique')]}

I do not expect any uniqueness violations, since both unique fields are passed as lookup arguments in the update_or_create call.

How can I fix this?

EDIT: just to make it clear, this works if the DB's table is empty, so there is no problem with the itens argument in serializer = CreateItensSerializer(data=itens, many=True)

>Solution :

In your modeling, you made both itempr_id and cod_item unique, yes. But these are individually unique. This means that if there is a record with itempr_id=42, then no other row can have itempr_id=42.

Your .update_or_create(…) call will update a row only if both the itempr_id and the cod_item match, and that is something different. Indeed, if we update or create an item with itempr_id=42 and cod_item='A', and there is already a record with itempr_id=42 but with cod_item='B', then no row matches the lookup, so it will try to create a new record, and that raises an error because itempr_id=42 is already taken. If the fields were unique together, meaning that itempr_id=42 can occur multiple times and cod_item='A' can occur multiple times, but the combination can occur at most once, then .update_or_create(…) would have worked.
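To make the collision concrete, here is a minimal sketch that reproduces the same flow outside Django, using Python's built-in sqlite3 module. The table layout mirrors the model in the question, and update_or_create_sketch is a hypothetical helper that only imitates the "look up on both fields, else insert" behaviour. It is not Django's implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Both columns are individually unique, as in the Itens model.
conn.execute(
    "CREATE TABLE itens ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "itempr_id INTEGER UNIQUE, "
    "cod_item TEXT UNIQUE)"
)
conn.execute("INSERT INTO itens (itempr_id, cod_item) VALUES (42, 'B')")


def update_or_create_sketch(conn, itempr_id, cod_item):
    """Hypothetical helper: look up on BOTH fields, insert if nothing matches."""
    row = conn.execute(
        "SELECT id FROM itens WHERE itempr_id = ? AND cod_item = ?",
        (itempr_id, cod_item),
    ).fetchone()
    if row is not None:
        return row[0], False  # a row matched both fields, nothing to create
    cursor = conn.execute(
        "INSERT INTO itens (itempr_id, cod_item) VALUES (?, ?)",
        (itempr_id, cod_item),
    )
    return cursor.lastrowid, True


try:
    # (42, 'A') does not match the existing (42, 'B'), so an INSERT is attempted.
    update_or_create_sketch(conn, 42, "A")
    outcome = "created"
except sqlite3.IntegrityError as exc:
    # The INSERT collides with the unique itempr_id=42.
    outcome = f"IntegrityError: {exc}"

print(outcome)
```

The lookup finds nothing because it requires both values to match, so the insert runs into the single-column uniqueness of itempr_id.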

The main issue is probably not so much technical as functional: if two or more columns are individually unique, and you work with .update_or_create(…), then which row should be picked to update in case of a collision? The one with the already existing itempr_id, or the one with the already existing cod_item? Why the former over the latter?
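A small illustration of that ambiguity, using plain Python dictionaries as stand-ins for table rows (the sample data is made up for the example):

```python
# Two existing rows; each one collides with the incoming item on a different field.
existing = [
    {"id": 1, "itempr_id": 42, "cod_item": "B"},
    {"id": 2, "itempr_id": 7, "cod_item": "A"},
]
incoming = {"itempr_id": 42, "cod_item": "A"}

# Candidate rows to update, depending on which unique field we trust.
by_itempr_id = [r["id"] for r in existing if r["itempr_id"] == incoming["itempr_id"]]
by_cod_item = [r["id"] for r in existing if r["cod_item"] == incoming["cod_item"]]

print(by_itempr_id, by_cod_item)  # [1] [2]
```

Both rows are legitimate targets, and updating either one with the full incoming data would then collide with the other, so the choice has to come from the business rules rather than from the ORM.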

If we give priority to the itempr_id, we could work with:

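def create(self, validated_data):
    # First try to update the row that matches on itempr_id.
    queryset = Itens.objects.filter(
        itempr_id=validated_data.get('itempr_id')
    )
    updated = queryset.update(**validated_data)
    if not updated:
        # Otherwise fall back to the row that matches on cod_item.
        queryset = Itens.objects.filter(
            cod_item=validated_data.get('cod_item'),
        )
        updated = queryset.update(**validated_data)
    if not updated:
        # No row matches either field, so create a new one.
        return Itens.objects.create(**validated_data)
    # .update(…) returns a row count, so fetch the updated row to return it.
    return queryset.get()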

This is essentially what .update_or_create(…) does, except that .update_or_create(…) filters on the combination of the fields, which might be a bit more efficient.

But if we use .update_or_create(…) just to silence integrity errors, the errors are likely caused by bad modeling: it is not very common for a model to carry two or more (non-primary key) columns that are each unique individually. Typically it is the combination that should be unique.
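As a rough sketch of the "unique together" alternative, the sqlite3 example below declares one constraint over the pair of columns. In Django the equivalent would be a models.UniqueConstraint(fields=['itempr_id', 'cod_item'], name=...) in the model's Meta.constraints, with unique=True removed from the two fields. The descricao column and the upsert helper are hypothetical, added only so there is something to update.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# One constraint over the PAIR of columns instead of one per column.
db.execute(
    "CREATE TABLE itens ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "itempr_id INTEGER, "
    "cod_item TEXT, "
    "descricao TEXT, "  # hypothetical extra column
    "UNIQUE (itempr_id, cod_item))"
)


def upsert(itempr_id, cod_item, descricao):
    """Hypothetical helper: update the row matching both fields, else insert."""
    found = db.execute(
        "SELECT id FROM itens WHERE itempr_id = ? AND cod_item = ?",
        (itempr_id, cod_item),
    ).fetchone()
    if found is not None:
        db.execute(
            "UPDATE itens SET descricao = ? WHERE id = ?", (descricao, found[0])
        )
        return False  # updated an existing row
    db.execute(
        "INSERT INTO itens (itempr_id, cod_item, descricao) VALUES (?, ?, ?)",
        (itempr_id, cod_item, descricao),
    )
    return True  # created a new row


created_flags = [
    upsert(42, "A", "first"),
    upsert(42, "B", "second"),   # same itempr_id, different cod_item: allowed
    upsert(42, "A", "changed"),  # same pair again: updates instead of failing
]
rows = db.execute(
    "SELECT itempr_id, cod_item, descricao FROM itens ORDER BY id"
).fetchall()

print(created_flags)  # [True, True, False]
print(rows)           # [(42, 'A', 'changed'), (42, 'B', 'second')]
```

With the constraint on the combination, the lookup used to find a row and the rule enforced by the database agree, so a lookup miss always means the insert is safe.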