Django REST Framework Serialization POST is slow












I am running on Django 2.1.1 and Python 3.6.5 and am performing a reasonably large POST operation (32,000 JSON objects). I have the following:



Model:



    class Data(models.Model):
        investigation = models.ForeignKey(Investigation)
        usage = models.FloatField()
        sector = models.CharField(max_length=100, blank=False, default='')
        cost = models.FloatField()
        demand = models.FloatField()


Serializer:



    class DataSerializer(serializers.ModelSerializer):
        class Meta:
            model = Data
            fields = ('investigation', 'usage', 'sector', 'cost', 'demand')


View:



    class DataView(generics.CreateAPIView):
        def create(self, request, pk, format=None):
            data_serializer = DataSerializer(data=request.data, many=True)
            if data_serializer.is_valid():
                data_serializer.save()


The problems come at both the is_valid() and save() steps, each of which fires off a separate query for each of the 32,000 objects.



I've spent a long time looking into the issue. My guess is that the is_valid() step is slow because of the N+1 query problem, since the foreign key is looked up for every object (although I could be wrong about this!), but I have no idea how to apply prefetch_related in this framework.



The save() step (which is the slowest part) obviously needs to be done in one query (probably a bulk_create), but I can't find where to add the bulk_create step. I've read this question but am still none the wiser from the answer. I tried to create a ListSerializer as the question suggests, but the objects still seemed to be serialized one by one.



Any pointers would be greatly appreciated.







python django django-rest-framework
















asked Nov 18 '18 at 19:38

Tom













  • It might help you to check out this post.

    – Aurora Wang
    Nov 18 '18 at 20:43











  • @AuroraWang I don't think this helps. There is still a query for every validation step and for every creation. I think that this is just a different way of having many=True in your view.

    – Tom
    Nov 19 '18 at 8:31































2 Answers
You can try overriding the create method of the view as follows:

    from rest_framework import status
    from rest_framework.response import Response

    # Defined on the view (e.g. a generics.CreateAPIView subclass), not on the serializer.
    def create(self, request, *args, **kwargs):
        # Accept either a single object or a list of objects.
        is_many = isinstance(request.data, list)

        serializer = self.get_serializer(data=request.data, many=is_many)
        serializer.is_valid(raise_exception=True)
        self.perform_create(serializer)
        headers = self.get_success_headers(serializer.data)
        return Response(serializer.data, status=status.HTTP_201_CREATED, headers=headers)
















answered Nov 19 '18 at 7:12

Nikhil Mohan













    • It seems to me as if you've just pasted code from the comment above which isn't very helpful. Please see my comment as to why this doesn't work.

      – Tom
      Nov 19 '18 at 8:31



















One possible solution is to call the Django ORM's bulk_create() after validating the data with your serializer. Your view would then look something like this:

    class DataView(generics.CreateAPIView):
        def create(self, request, pk, format=None):
            data_serializer = DataSerializer(data=request.data, many=True)
            if data_serializer.is_valid():
                # Build unsaved model instances, then insert them together.
                data_objects = []
                for data_object_info in data_serializer.validated_data:
                    data_objects.append(Data(**data_object_info))
                Data.objects.bulk_create(data_objects)

or just the following, if you want a one-liner:

    Data.objects.bulk_create([Data(**params) for params in data_serializer.validated_data])

If you don't want to clutter your view, you can write a class or method that performs the validation (using the serializer) and the creation, and then call it from the view.
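As a rough illustration of that helper idea, here is a minimal sketch of "check every foreign key in one go, then insert every row in one batch". It deliberately uses the standard-library sqlite3 module as a stand-in for the Django ORM so that it runs on its own. The table layout and the names make_db, existing_investigation_ids and bulk_insert are assumptions made for this example; they are not part of Django or DRF. In a real project the insert step would be the Data.objects.bulk_create(...) call shown above (which also accepts a batch_size argument), and DRF's documented hook for it is the create() method of a ListSerializer subclass.

```python
import sqlite3


def make_db():
    """Create an in-memory database with hypothetical stand-in tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE investigation (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE data ("
        "id INTEGER PRIMARY KEY, "
        "investigation_id INTEGER NOT NULL REFERENCES investigation(id), "
        "usage REAL, sector TEXT, cost REAL, demand REAL)"
    )
    conn.executemany("INSERT INTO investigation (id) VALUES (?)", [(1,), (2,)])
    return conn


def existing_investigation_ids(conn, wanted_ids, chunk_size=500):
    """Return the subset of wanted_ids that exist, one query per chunk of ids
    instead of one query per posted object."""
    wanted = sorted(set(wanted_ids))
    found = set()
    for start in range(0, len(wanted), chunk_size):
        chunk = wanted[start:start + chunk_size]
        placeholders = ",".join("?" * len(chunk))
        rows = conn.execute(
            f"SELECT id FROM investigation WHERE id IN ({placeholders})", chunk
        )
        found.update(row[0] for row in rows)
    return found


def bulk_insert(conn, payload):
    """Validate all foreign keys up front, then insert every row in one batch."""
    valid_ids = existing_investigation_ids(
        conn, (item["investigation"] for item in payload)
    )
    bad = [i for i, item in enumerate(payload) if item["investigation"] not in valid_ids]
    if bad:
        raise ValueError(f"unknown investigation id at positions {bad}")

    rows = [
        (item["investigation"], item["usage"], item["sector"], item["cost"], item["demand"])
        for item in payload
    ]
    with conn:  # one transaction for the whole batch
        conn.executemany(
            "INSERT INTO data (investigation_id, usage, sector, cost, demand) "
            "VALUES (?, ?, ?, ?, ?)",
            rows,
        )
    return len(rows)
```

The point of the sketch is only the shape of the work: the number of lookups depends on how many distinct investigation ids appear in the payload, not on the 32,000 objects, and the insert is issued as one batch inside one transaction.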

















answered Jan 23 at 13:27

cepradeep





























