Use ID file to replace some columns in dataset then print entire datset

Still new to coding and having difficulty connecting commands to get the required output.

I have a data file in the following format:

NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl

206004574,20141208,6,2,92,1,2014,424.4410055,NA,1

206004573,20141209,6,2,93,2,2014,436.4504712,NA,4

206004575,20141207,6,2,91,1,2014,380.94688,NA,6

206004576,20141208,6,2,92,2,2014,424.4410055,NA,7

206004579,20141209,6,2,93,2,2014,436.4504712,NA,8

206004571,20141207,6,2,91,1,2014,380.94688,NA,9

In the data, I would like to find the rows where the variable YBr == 1 and replace all columns in that row with * except for the NationalCowID or qtl. Then print the entire dataset.

This is what I am trying to achieve

NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl

206004574,20141208,*,*,*,*,*,*,*,*,1

206004573,20141209,6,2,93,2,2014,436.4504712,NA,4

206004575,20141207,*,*,*,*,*,*,*,*,6

206004576,20141208,6,2,92,2,2014,424.4410055,NA,7

206004579,20141209,6,2,93,2,2014,436.4504712,NA,8

206004571,20141207,*,*,*,*,*,*,*,*,9

I've figured out to use sed and awk, but am struggling to go beyond the basics and use commands in combination:

awk -F ',' '{ if ($3 == 1) sed '{s/$0/*/g}' print $0}'}' file1 > file2

Any direction would be really appreciated!

edited Nov 14 '18 at 1:58

asked Nov 14 '18 at 0:34

Cae.rich

244

Good that you have shown us your efforts in order to solve this question, always show expected output too in code tags.

– RavinderSingh13
Nov 14 '18 at 1:41

add a comment |

Still new to coding and having difficulty connecting commands to get the required output.

I have a data file in the following format:

NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl

206004574,20141208,6,2,92,1,2014,424.4410055,NA,1

206004573,20141209,6,2,93,2,2014,436.4504712,NA,4

206004575,20141207,6,2,91,1,2014,380.94688,NA,6

206004576,20141208,6,2,92,2,2014,424.4410055,NA,7

206004579,20141209,6,2,93,2,2014,436.4504712,NA,8

206004571,20141207,6,2,91,1,2014,380.94688,NA,9

In the data, I would like to find the rows where the variable YBr == 1 and replace all columns in that row with * except for the NationalCowID or qtl. Then print the entire dataset.

This is what I am trying to achieve

NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl

206004574,20141208,*,*,*,*,*,*,*,*,1

206004573,20141209,6,2,93,2,2014,436.4504712,NA,4

206004575,20141207,*,*,*,*,*,*,*,*,6

206004576,20141208,6,2,92,2,2014,424.4410055,NA,7

206004579,20141209,6,2,93,2,2014,436.4504712,NA,8

206004571,20141207,*,*,*,*,*,*,*,*,9

I've figured out to use sed and awk, but am struggling to go beyond the basics and use commands in combination:

awk -F ',' '{ if ($3 == 1) sed '{s/$0/*/g}' print $0}'}' file1 > file2

Any direction would be really appreciated!

edited Nov 14 '18 at 1:58

asked Nov 14 '18 at 0:34

Cae.rich

244

Good that you have shown us your efforts in order to solve this question, always show expected output too in code tags.

– RavinderSingh13
Nov 14 '18 at 1:41

add a comment |

Still new to coding and having difficulty connecting commands to get the required output.

I have a data file in the following format:

NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl

206004574,20141208,6,2,92,1,2014,424.4410055,NA,1

206004573,20141209,6,2,93,2,2014,436.4504712,NA,4

206004575,20141207,6,2,91,1,2014,380.94688,NA,6

206004576,20141208,6,2,92,2,2014,424.4410055,NA,7

206004579,20141209,6,2,93,2,2014,436.4504712,NA,8

206004571,20141207,6,2,91,1,2014,380.94688,NA,9

In the data, I would like to find the rows where the variable YBr == 1 and replace all columns in that row with * except for the NationalCowID or qtl. Then print the entire dataset.

This is what I am trying to achieve

NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl

206004574,20141208,*,*,*,*,*,*,*,*,1

206004573,20141209,6,2,93,2,2014,436.4504712,NA,4

206004575,20141207,*,*,*,*,*,*,*,*,6

206004576,20141208,6,2,92,2,2014,424.4410055,NA,7

206004579,20141209,6,2,93,2,2014,436.4504712,NA,8

206004571,20141207,*,*,*,*,*,*,*,*,9

I've figured out to use sed and awk, but am struggling to go beyond the basics and use commands in combination:

awk -F ',' '{ if ($3 == 1) sed '{s/$0/*/g}' print $0}'}' file1 > file2

Any direction would be really appreciated!

edited Nov 14 '18 at 1:58

asked Nov 14 '18 at 0:34

Cae.rich

244

Still new to coding and having difficulty connecting commands to get the required output.

I have a data file in the following format:

NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl

206004574,20141208,6,2,92,1,2014,424.4410055,NA,1

206004573,20141209,6,2,93,2,2014,436.4504712,NA,4

206004575,20141207,6,2,91,1,2014,380.94688,NA,6

206004576,20141208,6,2,92,2,2014,424.4410055,NA,7

206004579,20141209,6,2,93,2,2014,436.4504712,NA,8

206004571,20141207,6,2,91,1,2014,380.94688,NA,9

In the data, I would like to find the rows where the variable YBr == 1 and replace all columns in that row with * except for the NationalCowID or qtl. Then print the entire dataset.

This is what I am trying to achieve

NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl

206004574,20141208,*,*,*,*,*,*,*,*,1

206004573,20141209,6,2,93,2,2014,436.4504712,NA,4

206004575,20141207,*,*,*,*,*,*,*,*,6

206004576,20141208,6,2,92,2,2014,424.4410055,NA,7

206004579,20141209,6,2,93,2,2014,436.4504712,NA,8

206004571,20141207,*,*,*,*,*,*,*,*,9

I've figured out to use sed and awk, but am struggling to go beyond the basics and use commands in combination:

awk -F ',' '{ if ($3 == 1) sed '{s/$0/*/g}' print $0}'}' file1 > file2

Any direction would be really appreciated!

dataframe awk sed columnsorting

edited Nov 14 '18 at 1:58

asked Nov 14 '18 at 0:34

Cae.rich

244

edited Nov 14 '18 at 1:58

asked Nov 14 '18 at 0:34

Cae.rich

244

edited Nov 14 '18 at 1:58

asked Nov 14 '18 at 0:34

Cae.rich

244

asked Nov 14 '18 at 0:34

Cae.rich

244

asked Nov 14 '18 at 0:34

Cae.rich

244

Good that you have shown us your efforts in order to solve this question, always show expected output too in code tags.

– RavinderSingh13
Nov 14 '18 at 1:41

add a comment |

Good that you have shown us your efforts in order to solve this question, always show expected output too in code tags.

– RavinderSingh13
Nov 14 '18 at 1:41

Good that you have shown us your efforts in order to solve this question, always show expected output too in code tags.

– RavinderSingh13
Nov 14 '18 at 1:41

add a comment |

1 Answer
1

active

oldest

votes

Since you didn't show us sample expected output so can't be sure about following. By this solution I am not hard coding field values of strings YBr or NationalCowID so it could be dynamic too. Could you please try following once.

awk '

BEGIN{

  FS=OFS=","

}

FNR==1{

  for(i=1;i<=NF;i++){

    if($i=="YBr"){

       field=i

    }

    if($i=="NationalCowID"){

       value=i

    }

  }

}

$field==1{

  for(i=value+1;i<=NF;i++){

       $i="*"

  }

}

1

' Input_file

Output will be as follows.

NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl

206004574,*,*,*,*,*,*,*,*,*

206004573,20141209,6,2,93,2,2014,436.4504712,NA,4

206004575,*,*,*,*,*,*,*,*,*

206004576,20141208,6,2,92,2,2014,424.4410055,NA,7

206004579,20141209,6,2,93,2,2014,436.4504712,NA,8

206004571,*,*,*,*,*,*,*,*,*

edited Nov 14 '18 at 1:56

answered Nov 14 '18 at 1:40

RavinderSingh13

26.1k41438

Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?

– Cae.rich
Nov 14 '18 at 1:50

@Cae.rich, to keep the last column change for loop's value from for(i=value+1;i<=NF;i++) TO for(i=value+1;i<=(NF-1);i++) and let me know then?

– RavinderSingh13
Nov 14 '18 at 1:53

1

@ RavinderSingh13 Worked perfectly! Thank-you so much for your help!

– Cae.rich
Nov 14 '18 at 2:05

@Cae.rich, your welcome.

– RavinderSingh13
Nov 14 '18 at 2:06

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53291486%2fuse-id-file-to-replace-some-columns-in-dataset-then-print-entire-datset%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

awk '

BEGIN{

  FS=OFS=","

}

FNR==1{

  for(i=1;i<=NF;i++){

    if($i=="YBr"){

       field=i

    }

    if($i=="NationalCowID"){

       value=i

    }

  }

}

$field==1{

  for(i=value+1;i<=NF;i++){

       $i="*"

  }

}

1

' Input_file

Output will be as follows.

NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl

206004574,*,*,*,*,*,*,*,*,*

206004573,20141209,6,2,93,2,2014,436.4504712,NA,4

206004575,*,*,*,*,*,*,*,*,*

206004576,20141208,6,2,92,2,2014,424.4410055,NA,7

206004579,20141209,6,2,93,2,2014,436.4504712,NA,8

206004571,*,*,*,*,*,*,*,*,*

edited Nov 14 '18 at 1:56

answered Nov 14 '18 at 1:40

RavinderSingh13

26.1k41438

Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?

– Cae.rich
Nov 14 '18 at 1:50

@Cae.rich, to keep the last column change for loop's value from for(i=value+1;i<=NF;i++) TO for(i=value+1;i<=(NF-1);i++) and let me know then?

– RavinderSingh13
Nov 14 '18 at 1:53

1

@ RavinderSingh13 Worked perfectly! Thank-you so much for your help!

– Cae.rich
Nov 14 '18 at 2:05

@Cae.rich, your welcome.

– RavinderSingh13
Nov 14 '18 at 2:06

add a comment |

awk '

BEGIN{

  FS=OFS=","

}

FNR==1{

  for(i=1;i<=NF;i++){

    if($i=="YBr"){

       field=i

    }

    if($i=="NationalCowID"){

       value=i

    }

  }

}

$field==1{

  for(i=value+1;i<=NF;i++){

       $i="*"

  }

}

1

' Input_file

Output will be as follows.

NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl

206004574,*,*,*,*,*,*,*,*,*

206004573,20141209,6,2,93,2,2014,436.4504712,NA,4

206004575,*,*,*,*,*,*,*,*,*

206004576,20141208,6,2,92,2,2014,424.4410055,NA,7

206004579,20141209,6,2,93,2,2014,436.4504712,NA,8

206004571,*,*,*,*,*,*,*,*,*

edited Nov 14 '18 at 1:56

answered Nov 14 '18 at 1:40

RavinderSingh13

26.1k41438

Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?

– Cae.rich
Nov 14 '18 at 1:50

@Cae.rich, to keep the last column change for loop's value from for(i=value+1;i<=NF;i++) TO for(i=value+1;i<=(NF-1);i++) and let me know then?

– RavinderSingh13
Nov 14 '18 at 1:53

1

@ RavinderSingh13 Worked perfectly! Thank-you so much for your help!

– Cae.rich
Nov 14 '18 at 2:05

@Cae.rich, your welcome.

– RavinderSingh13
Nov 14 '18 at 2:06

add a comment |

awk '

BEGIN{

  FS=OFS=","

}

FNR==1{

  for(i=1;i<=NF;i++){

    if($i=="YBr"){

       field=i

    }

    if($i=="NationalCowID"){

       value=i

    }

  }

}

$field==1{

  for(i=value+1;i<=NF;i++){

       $i="*"

  }

}

1

' Input_file

Output will be as follows.

NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl

206004574,*,*,*,*,*,*,*,*,*

206004573,20141209,6,2,93,2,2014,436.4504712,NA,4

206004575,*,*,*,*,*,*,*,*,*

206004576,20141208,6,2,92,2,2014,424.4410055,NA,7

206004579,20141209,6,2,93,2,2014,436.4504712,NA,8

206004571,*,*,*,*,*,*,*,*,*

edited Nov 14 '18 at 1:56

answered Nov 14 '18 at 1:40

RavinderSingh13

26.1k41438

awk '

BEGIN{

  FS=OFS=","

}

FNR==1{

  for(i=1;i<=NF;i++){

    if($i=="YBr"){

       field=i

    }

    if($i=="NationalCowID"){

       value=i

    }

  }

}

$field==1{

  for(i=value+1;i<=NF;i++){

       $i="*"

  }

}

1

' Input_file

Output will be as follows.

NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl

206004574,*,*,*,*,*,*,*,*,*

206004573,20141209,6,2,93,2,2014,436.4504712,NA,4

206004575,*,*,*,*,*,*,*,*,*

206004576,20141208,6,2,92,2,2014,424.4410055,NA,7

206004579,20141209,6,2,93,2,2014,436.4504712,NA,8

206004571,*,*,*,*,*,*,*,*,*

edited Nov 14 '18 at 1:56

answered Nov 14 '18 at 1:40

RavinderSingh13

26.1k41438

edited Nov 14 '18 at 1:56

answered Nov 14 '18 at 1:40

RavinderSingh13

26.1k41438

answered Nov 14 '18 at 1:40

RavinderSingh13

26.1k41438

answered Nov 14 '18 at 1:40

RavinderSingh13

26.1k41438

Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?

– Cae.rich
Nov 14 '18 at 1:50

@Cae.rich, to keep the last column change for loop's value from for(i=value+1;i<=NF;i++) TO for(i=value+1;i<=(NF-1);i++) and let me know then?

– RavinderSingh13
Nov 14 '18 at 1:53

1

@ RavinderSingh13 Worked perfectly! Thank-you so much for your help!

– Cae.rich
Nov 14 '18 at 2:05

@Cae.rich, your welcome.

– RavinderSingh13
Nov 14 '18 at 2:06

add a comment |

Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?

– Cae.rich
Nov 14 '18 at 1:50

@Cae.rich, to keep the last column change for loop's value from for(i=value+1;i<=NF;i++) TO for(i=value+1;i<=(NF-1);i++) and let me know then?

– RavinderSingh13
Nov 14 '18 at 1:53

1

@ RavinderSingh13 Worked perfectly! Thank-you so much for your help!

– Cae.rich
Nov 14 '18 at 2:05

@Cae.rich, your welcome.

– RavinderSingh13
Nov 14 '18 at 2:06

Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?

– Cae.rich
Nov 14 '18 at 1:50

@Cae.rich, to keep the last column change for loop's value from for(i=value+1;i<=NF;i++) TO for(i=value+1;i<=(NF-1);i++) and let me know then?

– RavinderSingh13
Nov 14 '18 at 1:53

@ RavinderSingh13 Worked perfectly! Thank-you so much for your help!

– Cae.rich
Nov 14 '18 at 2:05

@Cae.rich, your welcome.

– RavinderSingh13
Nov 14 '18 at 2:06

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Wsrtjtyk