Can we load a text file separated by "::" into a Hive table?
Is there a way to load a simple text file whose fields are separated by "::" into a Hive table, other than replacing each "::" with "," and then loading it?
Replacing "::" with "," is quick when the text file is small, but what if it contains millions of records?
hive
asked Nov 7 at 18:53
VIN
1 Answer
accepted
Try creating the Hive table using the Regex SerDe.
Example:
I had a file with the following text in it:
i::90
w::99
Create Hive table:
hive> create external table default.i
(Id STRING,
Name STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES ('input.regex' = '(.*?)::(.*)')
STORED AS TEXTFILE;
Select from Hive table:
hive> select * from i;
+-------+---------+--+
| i.id  | i.name  |
+-------+---------+--+
| i     | 90      |
| w     | 99      |
+-------+---------+--+
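To see why this pattern splits each row the way the query output shows, here is a minimal sketch that applies the same expression outside Hive. It uses Python's re module as a stand-in for the Java regex engine that RegexSerDe relies on; for a pattern this simple the two should behave the same, but that is an assumption. The helper name split_row and the input "a::b::c" are hypothetical, added only for illustration.

```python
import re

# Same expression as the 'input.regex' property in the table definition.
PATTERN = re.compile(r"(.*?)::(.*)")

def split_row(line):
    """Return the captured groups, or None if the whole line does not match."""
    match = PATTERN.fullmatch(line)
    return match.groups() if match else None

print(split_row("i::90"))    # ('i', '90')
print(split_row("w::99"))    # ('w', '99')
# The first group is lazy, so the split happens at the first "::";
# the greedy second group keeps any later "::" inside the last column.
print(split_row("a::b::c"))  # ('a', 'b::c')
```

Each capture group maps to one table column in order, so the number of groups should equal the number of columns.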
If you want to skip a header line, use the syntax below:
hive> create external table default.i
(Id STRING,
Name STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES ('input.regex' = '(.*?)::(.*)')
STORED AS TEXTFILE
tblproperties ('skip.header.line.count'='1');
UPDATE:
Check whether there are any older files in your table location. If there are, delete them (if you don't need them).
1. Create the Hive table as:
create external table <db_name>.<table_name>
(col1 STRING,
col2 STRING,
col3 string,
col4 string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES ('input.regex' = '(.*?)::(.*?)::(.*?)::(.*)')
STORED AS TEXTFILE;
2. Then run:
load data local inpath '<source_path>' overwrite into table <db_name>.<table_name>;
It gives me error ED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: org.apache.hadoop.hive.serde2.regexserde
– VIN
Nov 7 at 20:07
@vicky, the SerDe class name is case sensitive: use RegexSerDe, not regexserde. Try creating the Hive table again.
– Shu
Nov 7 at 20:17
OK, I did that. This time the table was created, but when I load the data and query the table it returns NULL for all fields, which is basically the same result I was getting earlier.
– VIN
Nov 7 at 20:37
@vicky, what do you mean by "load the data"? Create an external table and copy the file into the table location. Make sure your regex is correct and captures the groups correctly. If you are still having issues, update the question with some sample data.
– Shu
Nov 7 at 20:56
Loading the data means (load data local inpath 'Source path' overwrite into table 'Destination table'), and the sample data looks like this (1::914::3::978301968 1::3408::4::978300275 1::2355::5::978824291 1::1197::3::978302268 1::1287::5::978302039 1::2804::5::978300719 1::594::4::978302268 1::919::4::978301368 1::595::5::978824268)
– VIN
Nov 7 at 21:20
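One way to act on the advice about checking the regex is to run it over a few lines of the file before loading. RegexSerDe returns NULL for every column of a row that does not match input.regex, so any line the check rejects would show up as an all-NULL row. The sketch below is only an approximation of that behaviour: it uses Python's re module rather than Hive's Java regex engine, the function name check_lines is hypothetical, and the third sample line is a made-up malformed row added to show a rejection.

```python
import re

# Four-column expression from the UPDATE section of the answer.
PATTERN = re.compile(r"(.*?)::(.*?)::(.*?)::(.*)")

def check_lines(lines):
    """Split lines into (parsed rows, lines that do not match the pattern)."""
    parsed, rejected = [], []
    for raw in lines:
        line = raw.rstrip("\n")
        match = PATTERN.fullmatch(line)
        if match:
            parsed.append(match.groups())
        else:
            rejected.append(line)
    return parsed, rejected

# The first two lines come from the sample data in the comment above;
# the third is a hypothetical malformed row with too few fields.
sample = ["1::914::3::978301968\n", "1::3408::4::978300275\n", "1::2355"]
parsed, rejected = check_lines(sample)
print(parsed)    # [('1', '914', '3', '978301968'), ('1', '3408', '4', '978300275')]
print(rejected)  # ['1::2355']
```

If the sample lines parse here but the table still returns NULLs, the regex itself is probably not the problem, and the table definition or the files in the table location are worth re-checking.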
edited Nov 7 at 22:09
answered Nov 7 at 19:12
Shu