Postgres - non-linear relationship between row count and query speed
Why isn't there a direct (linear) relationship between the number of rows being processed and the time taken?
Example - I'm moving rows from one table to another. Moving 1 million rows takes about 20 seconds. Moving 10 million rows doesn't take the 200 seconds (a little over 3 minutes) I would expect; it takes closer to 20 minutes. Moving 20 million rows takes about 2 hours.
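To put numbers on the gap, the per-row cost implied by those three timings can be worked out directly. This is a minimal sketch, taking "closer to 20 minutes" as 1200 seconds and "about 2 hours" as 7200 seconds (both roundings are assumptions, and the column names are purely illustrative):

```sql
-- Approximate timings from the example above, converted to seconds.
-- The 1200 and 7200 values are rounded assumptions, not measurements.
SELECT rows_moved,
       seconds,
       round(seconds * 1e6 / rows_moved) AS microseconds_per_row
FROM (VALUES
        (1000000,    20),   -- about 20 seconds
        (10000000, 1200),   -- closer to 20 minutes
        (20000000, 7200)    -- about 2 hours
     ) AS t(rows_moved, seconds);
```

That works out to roughly 20, 120 and 360 microseconds per row, so the cost of each row grows with the size of the batch instead of staying constant.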
Background - I'm merging daily partitions into larger monthly partitions by running the following statements as a single transaction:
ALTER TABLE table_a DETACH PARTITION table_a_201811;
ALTER TABLE table_a DETACH PARTITION table_a_20181104;
WITH moved_rows AS
(
    DELETE FROM table_a_20181104
    RETURNING *
)
INSERT INTO table_a_201811
SELECT * FROM moved_rows;
ALTER TABLE table_a ATTACH PARTITION table_a_201811 FOR VALUES FROM ('2018-11-01') TO ('2018-11-05');
DROP TABLE table_a_20181104;
Experimentation indicates that the ALTER TABLE commands for detaching/attaching partitions take only a few seconds (seemingly independent of table size), whilst the middle statement that actually does the transfer takes the bulk of the time.
I had thought that if it takes x seconds to move a million rows, it would take 2x seconds to move 2 million rows and 10x seconds to move 10 million rows. This doesn't seem to be the case. Why not? And is there a way of improving the performance?
I'm using version 10.5, and the process has exclusive access to the database (no other connections, and no locks showing in pg_locks).
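For anyone wanting to see where the time in the transfer statement goes, one option is to run it under EXPLAIN (ANALYZE, BUFFERS) and compare the output at different row counts. This is only a sketch, assuming both partitions have already been detached as in the statements above. Note that EXPLAIN ANALYZE really executes the statement, so it is wrapped in a transaction that is rolled back:

```sql
-- Sketch only: assumes table_a_201811 and table_a_20181104 are already detached.
BEGIN;

-- ANALYZE executes the statement and reports actual timings per plan node;
-- BUFFERS adds shared-buffer hit/read counts.
EXPLAIN (ANALYZE, BUFFERS)
WITH moved_rows AS
(
    DELETE FROM table_a_20181104
    RETURNING *
)
INSERT INTO table_a_201811
SELECT * FROM moved_rows;

-- Undo the move so the test can be repeated with a different row count.
ROLLBACK;
```

Comparing the plans for 1 million and 10 million rows should show which node's time grows faster than the row count.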
postgresql database-performance
asked Nov 7 at 18:17
Hemel