Use ID file to replace some columns in dataset then print entire datset












1















Still new to coding and having difficulty connecting commands to get the required output.



I have a data file in the following format:



NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,20141208,6,2,92,1,2014,424.4410055,NA,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,20141207,6,2,91,1,2014,380.94688,NA,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,20141207,6,2,91,1,2014,380.94688,NA,9


In the data, I would like to find the rows where the variable YBr == 1 and replace all columns in that row with * except for the NationalCowID or qtl. Then print the entire dataset.



This is what I am trying to achieve



NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,20141208,*,*,*,*,*,*,*,*,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,20141207,*,*,*,*,*,*,*,*,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,20141207,*,*,*,*,*,*,*,*,9


I've figured out to use sed and awk, but am struggling to go beyond the basics and use commands in combination:



awk -F ',' '{ if ($3 == 1) sed '{s/$0/*/g}' print $0}'}' file1 > file2


Any direction would be really appreciated!










share|improve this question

























  • Good that you have shown us your efforts in order to solve this question, always show expected output too in code tags.

    – RavinderSingh13
    Nov 14 '18 at 1:41
















1















Still new to coding and having difficulty connecting commands to get the required output.



I have a data file in the following format:



NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,20141208,6,2,92,1,2014,424.4410055,NA,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,20141207,6,2,91,1,2014,380.94688,NA,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,20141207,6,2,91,1,2014,380.94688,NA,9


In the data, I would like to find the rows where the variable YBr == 1 and replace all columns in that row with * except for the NationalCowID or qtl. Then print the entire dataset.



This is what I am trying to achieve



NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,20141208,*,*,*,*,*,*,*,*,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,20141207,*,*,*,*,*,*,*,*,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,20141207,*,*,*,*,*,*,*,*,9


I've figured out to use sed and awk, but am struggling to go beyond the basics and use commands in combination:



awk -F ',' '{ if ($3 == 1) sed '{s/$0/*/g}' print $0}'}' file1 > file2


Any direction would be really appreciated!










share|improve this question

























  • Good that you have shown us your efforts in order to solve this question, always show expected output too in code tags.

    – RavinderSingh13
    Nov 14 '18 at 1:41














1












1








1








Still new to coding and having difficulty connecting commands to get the required output.



I have a data file in the following format:



NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,20141208,6,2,92,1,2014,424.4410055,NA,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,20141207,6,2,91,1,2014,380.94688,NA,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,20141207,6,2,91,1,2014,380.94688,NA,9


In the data, I would like to find the rows where the variable YBr == 1 and replace all columns in that row with * except for the NationalCowID or qtl. Then print the entire dataset.



This is what I am trying to achieve



NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,20141208,*,*,*,*,*,*,*,*,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,20141207,*,*,*,*,*,*,*,*,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,20141207,*,*,*,*,*,*,*,*,9


I've figured out to use sed and awk, but am struggling to go beyond the basics and use commands in combination:



awk -F ',' '{ if ($3 == 1) sed '{s/$0/*/g}' print $0}'}' file1 > file2


Any direction would be really appreciated!










share|improve this question
















Still new to coding and having difficulty connecting commands to get the required output.



I have a data file in the following format:



NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,20141208,6,2,92,1,2014,424.4410055,NA,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,20141207,6,2,91,1,2014,380.94688,NA,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,20141207,6,2,91,1,2014,380.94688,NA,9


In the data, I would like to find the rows where the variable YBr == 1 and replace all columns in that row with * except for the NationalCowID or qtl. Then print the entire dataset.



This is what I am trying to achieve



NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,20141208,*,*,*,*,*,*,*,*,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,20141207,*,*,*,*,*,*,*,*,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,20141207,*,*,*,*,*,*,*,*,9


I've figured out to use sed and awk, but am struggling to go beyond the basics and use commands in combination:



awk -F ',' '{ if ($3 == 1) sed '{s/$0/*/g}' print $0}'}' file1 > file2


Any direction would be really appreciated!







dataframe awk sed columnsorting






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 14 '18 at 1:58







Cae.rich

















asked Nov 14 '18 at 0:34









Cae.richCae.rich

244




244













  • Good that you have shown us your efforts in order to solve this question, always show expected output too in code tags.

    – RavinderSingh13
    Nov 14 '18 at 1:41



















  • Good that you have shown us your efforts in order to solve this question, always show expected output too in code tags.

    – RavinderSingh13
    Nov 14 '18 at 1:41

















Good that you have shown us your efforts in order to solve this question, always show expected output too in code tags.

– RavinderSingh13
Nov 14 '18 at 1:41





Good that you have shown us your efforts in order to solve this question, always show expected output too in code tags.

– RavinderSingh13
Nov 14 '18 at 1:41












1 Answer
1






active

oldest

votes


















1














Since you didn't show us sample expected output so can't be sure about following. By this solution I am not hard coding field values of strings YBr or NationalCowID so it could be dynamic too. Could you please try following once.



awk '
BEGIN{
FS=OFS=","
}
FNR==1{
for(i=1;i<=NF;i++){
if($i=="YBr"){
field=i
}
if($i=="NationalCowID"){
value=i
}
}
}
$field==1{
for(i=value+1;i<=NF;i++){
$i="*"
}
}
1
' Input_file


Output will be as follows.



NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,*,*,*,*,*,*,*,*,*
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,*,*,*,*,*,*,*,*,*
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,*,*,*,*,*,*,*,*,*





share|improve this answer


























  • Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?

    – Cae.rich
    Nov 14 '18 at 1:50













  • @Cae.rich, to keep the last column change for loop's value from for(i=value+1;i<=NF;i++) TO for(i=value+1;i<=(NF-1);i++) and let me know then?

    – RavinderSingh13
    Nov 14 '18 at 1:53






  • 1





    @ RavinderSingh13 Worked perfectly! Thank-you so much for your help!

    – Cae.rich
    Nov 14 '18 at 2:05











  • @Cae.rich, your welcome.

    – RavinderSingh13
    Nov 14 '18 at 2:06











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53291486%2fuse-id-file-to-replace-some-columns-in-dataset-then-print-entire-datset%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









1














Since you didn't show us sample expected output so can't be sure about following. By this solution I am not hard coding field values of strings YBr or NationalCowID so it could be dynamic too. Could you please try following once.



awk '
BEGIN{
FS=OFS=","
}
FNR==1{
for(i=1;i<=NF;i++){
if($i=="YBr"){
field=i
}
if($i=="NationalCowID"){
value=i
}
}
}
$field==1{
for(i=value+1;i<=NF;i++){
$i="*"
}
}
1
' Input_file


Output will be as follows.



NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,*,*,*,*,*,*,*,*,*
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,*,*,*,*,*,*,*,*,*
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,*,*,*,*,*,*,*,*,*





share|improve this answer


























  • Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?

    – Cae.rich
    Nov 14 '18 at 1:50













  • @Cae.rich, to keep the last column change for loop's value from for(i=value+1;i<=NF;i++) TO for(i=value+1;i<=(NF-1);i++) and let me know then?

    – RavinderSingh13
    Nov 14 '18 at 1:53






  • 1





    @ RavinderSingh13 Worked perfectly! Thank-you so much for your help!

    – Cae.rich
    Nov 14 '18 at 2:05











  • @Cae.rich, your welcome.

    – RavinderSingh13
    Nov 14 '18 at 2:06
















1














Since you didn't show us sample expected output so can't be sure about following. By this solution I am not hard coding field values of strings YBr or NationalCowID so it could be dynamic too. Could you please try following once.



awk '
BEGIN{
FS=OFS=","
}
FNR==1{
for(i=1;i<=NF;i++){
if($i=="YBr"){
field=i
}
if($i=="NationalCowID"){
value=i
}
}
}
$field==1{
for(i=value+1;i<=NF;i++){
$i="*"
}
}
1
' Input_file


Output will be as follows.



NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,*,*,*,*,*,*,*,*,*
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,*,*,*,*,*,*,*,*,*
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,*,*,*,*,*,*,*,*,*





share|improve this answer


























  • Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?

    – Cae.rich
    Nov 14 '18 at 1:50













  • @Cae.rich, to keep the last column change for loop's value from for(i=value+1;i<=NF;i++) TO for(i=value+1;i<=(NF-1);i++) and let me know then?

    – RavinderSingh13
    Nov 14 '18 at 1:53






  • 1





    @ RavinderSingh13 Worked perfectly! Thank-you so much for your help!

    – Cae.rich
    Nov 14 '18 at 2:05











  • @Cae.rich, your welcome.

    – RavinderSingh13
    Nov 14 '18 at 2:06














1












1








1







Since you didn't show us sample expected output so can't be sure about following. By this solution I am not hard coding field values of strings YBr or NationalCowID so it could be dynamic too. Could you please try following once.



awk '
BEGIN{
FS=OFS=","
}
FNR==1{
for(i=1;i<=NF;i++){
if($i=="YBr"){
field=i
}
if($i=="NationalCowID"){
value=i
}
}
}
$field==1{
for(i=value+1;i<=NF;i++){
$i="*"
}
}
1
' Input_file


Output will be as follows.



NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,*,*,*,*,*,*,*,*,*
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,*,*,*,*,*,*,*,*,*
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,*,*,*,*,*,*,*,*,*





share|improve this answer















Since you didn't show us sample expected output so can't be sure about following. By this solution I am not hard coding field values of strings YBr or NationalCowID so it could be dynamic too. Could you please try following once.



awk '
BEGIN{
FS=OFS=","
}
FNR==1{
for(i=1;i<=NF;i++){
if($i=="YBr"){
field=i
}
if($i=="NationalCowID"){
value=i
}
}
}
$field==1{
for(i=value+1;i<=NF;i++){
$i="*"
}
}
1
' Input_file


Output will be as follows.



NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,*,*,*,*,*,*,*,*,*
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,*,*,*,*,*,*,*,*,*
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,*,*,*,*,*,*,*,*,*






share|improve this answer














share|improve this answer



share|improve this answer








edited Nov 14 '18 at 1:56

























answered Nov 14 '18 at 1:40









RavinderSingh13RavinderSingh13

26.1k41438




26.1k41438













  • Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?

    – Cae.rich
    Nov 14 '18 at 1:50













  • @Cae.rich, to keep the last column change for loop's value from for(i=value+1;i<=NF;i++) TO for(i=value+1;i<=(NF-1);i++) and let me know then?

    – RavinderSingh13
    Nov 14 '18 at 1:53






  • 1





    @ RavinderSingh13 Worked perfectly! Thank-you so much for your help!

    – Cae.rich
    Nov 14 '18 at 2:05











  • @Cae.rich, your welcome.

    – RavinderSingh13
    Nov 14 '18 at 2:06



















  • Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?

    – Cae.rich
    Nov 14 '18 at 1:50













  • @Cae.rich, to keep the last column change for loop's value from for(i=value+1;i<=NF;i++) TO for(i=value+1;i<=(NF-1);i++) and let me know then?

    – RavinderSingh13
    Nov 14 '18 at 1:53






  • 1





    @ RavinderSingh13 Worked perfectly! Thank-you so much for your help!

    – Cae.rich
    Nov 14 '18 at 2:05











  • @Cae.rich, your welcome.

    – RavinderSingh13
    Nov 14 '18 at 2:06

















Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?

– Cae.rich
Nov 14 '18 at 1:50







Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?

– Cae.rich
Nov 14 '18 at 1:50















@Cae.rich, to keep the last column change for loop's value from for(i=value+1;i<=NF;i++) TO for(i=value+1;i<=(NF-1);i++) and let me know then?

– RavinderSingh13
Nov 14 '18 at 1:53





@Cae.rich, to keep the last column change for loop's value from for(i=value+1;i<=NF;i++) TO for(i=value+1;i<=(NF-1);i++) and let me know then?

– RavinderSingh13
Nov 14 '18 at 1:53




1




1





@ RavinderSingh13 Worked perfectly! Thank-you so much for your help!

– Cae.rich
Nov 14 '18 at 2:05





@ RavinderSingh13 Worked perfectly! Thank-you so much for your help!

– Cae.rich
Nov 14 '18 at 2:05













@Cae.rich, your welcome.

– RavinderSingh13
Nov 14 '18 at 2:06





@Cae.rich, your welcome.

– RavinderSingh13
Nov 14 '18 at 2:06


















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53291486%2fuse-id-file-to-replace-some-columns-in-dataset-then-print-entire-datset%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







這個網誌中的熱門文章

Hercules Kyvelos

Tangent Lines Diagram Along Smooth Curve

Yusuf al-Mu'taman ibn Hud