How to query a table partitioned on a column in AWS Athena that uses Presto
I have created a table like this in AWS Athena:
CREATE EXTERNAL TABLE table (
  `timestamp` BIGINT,
  `id` STRING
)
PARTITIONED BY (
  date_column STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION 's3://bucket/key'
TBLPROPERTIES (
  'parquet.compress'='SNAPPY',
  'CrawlerSchemaDeserializerVersion'='1.0',
  'CrawlerSchemaSerializerVersion'='1.0',
  'classification'='parquet'
)
And after adding data, date_column looks like this:
date_column
date=2018102300
date=2018091500 //(so Sept 15, 2018)
I want to get data only for the month of September, but I am unable to frame the correct query.
So far I have this, which throws a date format error:
SELECT * FROM table
WHERE date_parse(date_column, 'date=%Y%m%d') >= date_parse('date=2018090100', 'date=%Y%m%d')
  AND date_parse(date_column, 'date=%Y%m%d') < date_parse('date=2018100100', 'date=%Y%m%d')
sql amazon-athena prestodb
Why do you store "date=2018102300" instead of "2018102300"?
– j.b.gorski
Nov 17 '18 at 23:25
asked Nov 16 '18 at 21:46 by Atihska
1 Answer
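The format string you are passing to date_parse() does not match your values. Each value ends with a two-digit hour (the trailing 00), which '%Y%m%d' leaves unparsed, so the format needs %H as well:
select date_parse('2018091500', '%Y%m%d%H') returns 2018-09-15 00:00:00.000
You can then rewrite your query to fetch the results for September. Use a half-open range so that the later hours of September 30 (for example 2018093023) are included:
select * from table where date_parse(date_column, '%Y%m%d%H') >= timestamp '2018-09-01 00:00:00' and date_parse(date_column, '%Y%m%d%H') < timestamp '2018-10-01 00:00:00'
If the stored values really do include the date= prefix, it should be possible to keep it as literal text in the format string: 'date=%Y%m%d%H'.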
answered Dec 21 '18 at 18:29 by bdcloud
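Because the partition values are fixed-width yyyyMMddHH strings, a plain string comparison may be enough to select one month, with no date parsing at all. The following is only a minimal sketch under two assumptions: the values are stored without the date= prefix, and the inline VALUES rows (made up for illustration) stand in for the real table.

```sql
-- Hypothetical sample values stand in for the partition column of the real table.
SELECT date_column
FROM (
    VALUES '2018083123', '2018090100', '2018091500', '2018093023', '2018100100'
) AS t (date_column)
-- Fixed-width yyyyMMddHH strings sort in chronological order,
-- so a half-open string range selects exactly September 2018.
WHERE date_column >= '2018090100'
  AND date_column <  '2018100100';
```

This should return only the three September values. If the stored values do carry the date= prefix, the same idea applies with the prefix added to both bounds ('date=2018090100' and 'date=2018100100'), since a shared prefix does not change the sort order.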