Django REST Framework Serialization POST is slow
I am running on Django 2.1.1 and Python 3.6.5 and am performing a reasonably large POST operation (32,000 JSON objects). I have the following:
Model:

    class Data(models.Model):
        investigation = models.ForeignKey(Investigation)
        usage = models.FloatField()
        sector = models.CharField(max_length=100, blank=False, default='')
        cost = models.FloatField()
        demand = models.FloatField()

Serializer:

    class DataSerializer(serializers.ModelSerializer):
        class Meta:
            model = Data
            fields = ('investigation', 'usage', 'sector', 'cost', 'demand')

View:

    class DataView(generics.CreateAPIView):
        def create(self, request, pk, format=None):
            data_serializer = DataSerializer(data=request.data, many=True)
            if data_serializer.is_valid():
                data_serializer.save()

The problems come at both the is_valid() and save() steps, each of which fires off a separate query for every one of the 32,000 objects.
I've spent a long time looking into the issue. My guess is that the is_valid() step is slow because of the N+1 query problem, since the foreign key is looked up for each object (although I could be wrong about this!), but I have no idea how to implement the prefetch_related method in this framework.
The save() step (which is the slowest part) obviously needs to be done in one query (probably a bulk_create), but I can't find where to add the bulk_create step. I've read this question but am still none the wiser from the answer. I tried to create a ListSerializer as the question suggests, but the objects still seemed to be serialized one by one.
Any pointers would be greatly appreciated.
python django django-rest-framework
asked Nov 18 '18 at 19:38 by Tom
It might help you to check out this post.
– Aurora Wang
Nov 18 '18 at 20:43
@AuroraWang I don't think this helps. There is still a query for every validation step and for every creation. I think that this is just a different way of having many=True in your view.
– Tom
Nov 19 '18 at 8:31
2 Answers
You can try overriding the view's create method as follows:

    from rest_framework import status
    from rest_framework.response import Response

    def create(self, request, *args, **kwargs):
        # Accept either a single object or a list of objects.
        is_many = isinstance(request.data, list)
        serializer = self.get_serializer(data=request.data, many=is_many)
        serializer.is_valid(raise_exception=True)
        self.perform_create(serializer)
        headers = self.get_success_headers(serializer.data)
        return Response(serializer.data, status=status.HTTP_201_CREATED, headers=headers)

answered Nov 19 '18 at 7:12 by Nikhil Mohan
It seems to me as if you've just pasted code from the comment above which isn't very helpful. Please see my comment as to why this doesn't work.
– Tom
Nov 19 '18 at 8:31
One possible solution is to perform a Django ORM bulk_create() after you validate the data using your serializer. Your view will then look something like this:

    from rest_framework import generics, status
    from rest_framework.response import Response

    class DataView(generics.CreateAPIView):
        def create(self, request, pk, format=None):
            data_serializer = DataSerializer(data=request.data, many=True)
            if data_serializer.is_valid():
                # Build unsaved model instances from the validated data.
                data_objects = []
                for data_object_info in data_serializer.validated_data:
                    data_objects.append(Data(**data_object_info))
                # Insert all rows in bulk instead of one INSERT per object.
                Data.objects.bulk_create(data_objects)
                return Response(status=status.HTTP_201_CREATED)
            return Response(data_serializer.errors, status=status.HTTP_400_BAD_REQUEST)

or just the following, if you want a one-liner:

    Data.objects.bulk_create([Data(**params) for params in data_serializer.validated_data])

If you don't want to clutter your view, then you can write a class or method that performs the validation (using the serializer) and creation. You can then use this inside the view.
answered Jan 23 at 13:27 by cepradeep