How to grep out the first file path in Python























Regex always gives me a headache, but I guess it might be the way to do this. Here is the string I have:



-rw-rw----+  3 userabc clouderausersdev   12267543 2018-02-05 16:41 hdfs://nameservice1/client/abc/scenarios/warehouse/product/tdb_histscen_2/part-00000-6fa2e019-96e5-4280-b2fc-994917013a6a-c000.snappy.parquet


All I want to grep out is the file's full path:




hdfs://nameservice1/client/abc/scenarios/warehouse/product/tdb_histscen_2/part-00000-6fa2e019-96e5-4280-b2fc-994917013a6a-c000.snappy.parquet




Thank you very much.
































  • Will the file path always begin with hdfs?
    – Rodolfo Donã Hosp
    Nov 5 at 17:01















python regex grep
















asked Nov 5 at 16:59









mdivk

























1 Answer




















accepted










Why not just take the last value of the space-separated string?



x = "-rw-rw----+  3 userabc clouderausersdev   12267543 2018-02-05 16:41 hdfs://nameservice1/client/abc/scenarios/warehouse/product/tdb_histscen_2/part-00000-6fa2e019-96e5-4280-b2fc-994917013a6a-c000.snappy.parquet"
parts = x.split()  # with no argument, split() splits on runs of whitespace and drops empty strings
fname = parts[-1]  # the path is the last field
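If the listing contains several lines, or a path might contain spaces, the same idea can be sketched as a small helper. This is only an illustration: the function name `extract_path` is hypothetical, and it assumes every entry has the same eight whitespace-separated fields as the sample line (permissions, replication, owner, group, size, date, time, path).

```python
def extract_path(line):
    """Return the path field of one listing line, or None if the line is not an entry."""
    # Assumption: an entry has 8 whitespace-separated fields and the path is the 8th.
    # maxsplit=7 leaves the remainder of the line intact, so spaces inside the
    # path are preserved instead of being split into extra fields.
    fields = line.split(None, 7)
    if len(fields) < 8:
        return None  # e.g. a header or an empty line
    return fields[7].rstrip()  # drop a trailing newline or trailing spaces


line = ("-rw-rw----+  3 userabc clouderausersdev   12267543 2018-02-05 16:41 "
        "hdfs://nameservice1/client/abc/scenarios/warehouse/product/tdb_histscen_2/"
        "part-00000-6fa2e019-96e5-4280-b2fc-994917013a6a-c000.snappy.parquet")
print(extract_path(line))
```

Taking `parts[-1]` is simpler and gives the same result whenever the path contains no whitespace.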


























  • Thank you, yes I just realized that :)
    – mdivk
    Nov 5 at 17:04










  • If your format is really fixed, as in an auto-generated log file, this answer is probably easier than using a lengthy regex to fish out the path. +1
    – Tim Biegeleisen
    Nov 5 at 17:06
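For comparison, the regex route raised in the question and comments might look like the sketch below. It assumes the path always starts with `hdfs://` and contains no whitespace, which the thread does not confirm; the names `HDFS_PATH` and `find_hdfs_path` are made up for illustration.

```python
import re

# Assumption: the path starts with "hdfs://" and runs to the next whitespace.
HDFS_PATH = re.compile(r"hdfs://\S+")


def find_hdfs_path(line):
    """Return the first hdfs:// path found in the line, or None if there is none."""
    match = HDFS_PATH.search(line)
    return match.group(0) if match else None


line = ("-rw-rw----+  3 userabc clouderausersdev   12267543 2018-02-05 16:41 "
        "hdfs://nameservice1/client/abc/scenarios/warehouse/product/tdb_histscen_2/"
        "part-00000-6fa2e019-96e5-4280-b2fc-994917013a6a-c000.snappy.parquet")
print(find_hdfs_path(line))
```

Unlike splitting on whitespace, this does not depend on the number of fields before the path, but it does depend on the `hdfs://` prefix.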











answered Nov 5 at 17:03









wpercy
