Use ID file to replace some columns in dataset then print entire datset
Still new to coding and having difficulty connecting commands to get the required output.
I have a data file in the following format:
NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,20141208,6,2,92,1,2014,424.4410055,NA,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,20141207,6,2,91,1,2014,380.94688,NA,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,20141207,6,2,91,1,2014,380.94688,NA,9
In the data, I would like to find the rows where the variable YBr == 1 and replace all columns in that row with * except for the NationalCowID or qtl. Then print the entire dataset.
This is what I am trying to achieve
NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,20141208,*,*,*,*,*,*,*,*,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,20141207,*,*,*,*,*,*,*,*,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,20141207,*,*,*,*,*,*,*,*,9
I've figured out to use sed and awk, but am struggling to go beyond the basics and use commands in combination:
awk -F ',' '{ if ($3 == 1) sed '{s/$0/*/g}' print $0}'}' file1 > file2
Any direction would be really appreciated!
dataframe awk sed columnsorting
add a comment |
Still new to coding and having difficulty connecting commands to get the required output.
I have a data file in the following format:
NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,20141208,6,2,92,1,2014,424.4410055,NA,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,20141207,6,2,91,1,2014,380.94688,NA,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,20141207,6,2,91,1,2014,380.94688,NA,9
In the data, I would like to find the rows where the variable YBr == 1 and replace all columns in that row with * except for the NationalCowID or qtl. Then print the entire dataset.
This is what I am trying to achieve
NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,20141208,*,*,*,*,*,*,*,*,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,20141207,*,*,*,*,*,*,*,*,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,20141207,*,*,*,*,*,*,*,*,9
I've figured out to use sed and awk, but am struggling to go beyond the basics and use commands in combination:
awk -F ',' '{ if ($3 == 1) sed '{s/$0/*/g}' print $0}'}' file1 > file2
Any direction would be really appreciated!
dataframe awk sed columnsorting
Good that you have shown us your efforts in order to solve this question, always show expected output too in code tags.
– RavinderSingh13
Nov 14 '18 at 1:41
add a comment |
Still new to coding and having difficulty connecting commands to get the required output.
I have a data file in the following format:
NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,20141208,6,2,92,1,2014,424.4410055,NA,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,20141207,6,2,91,1,2014,380.94688,NA,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,20141207,6,2,91,1,2014,380.94688,NA,9
In the data, I would like to find the rows where the variable YBr == 1 and replace all columns in that row with * except for the NationalCowID or qtl. Then print the entire dataset.
This is what I am trying to achieve
NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,20141208,*,*,*,*,*,*,*,*,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,20141207,*,*,*,*,*,*,*,*,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,20141207,*,*,*,*,*,*,*,*,9
I've figured out to use sed and awk, but am struggling to go beyond the basics and use commands in combination:
awk -F ',' '{ if ($3 == 1) sed '{s/$0/*/g}' print $0}'}' file1 > file2
Any direction would be really appreciated!
dataframe awk sed columnsorting
Still new to coding and having difficulty connecting commands to get the required output.
I have a data file in the following format:
NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,20141208,6,2,92,1,2014,424.4410055,NA,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,20141207,6,2,91,1,2014,380.94688,NA,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,20141207,6,2,91,1,2014,380.94688,NA,9
In the data, I would like to find the rows where the variable YBr == 1 and replace all columns in that row with * except for the NationalCowID or qtl. Then print the entire dataset.
This is what I am trying to achieve
NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,20141208,*,*,*,*,*,*,*,*,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,20141207,*,*,*,*,*,*,*,*,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,20141207,*,*,*,*,*,*,*,*,9
I've figured out to use sed and awk, but am struggling to go beyond the basics and use commands in combination:
awk -F ',' '{ if ($3 == 1) sed '{s/$0/*/g}' print $0}'}' file1 > file2
Any direction would be really appreciated!
dataframe awk sed columnsorting
dataframe awk sed columnsorting
edited Nov 14 '18 at 1:58
Cae.rich
asked Nov 14 '18 at 0:34
Cae.richCae.rich
244
244
Good that you have shown us your efforts in order to solve this question, always show expected output too in code tags.
– RavinderSingh13
Nov 14 '18 at 1:41
add a comment |
Good that you have shown us your efforts in order to solve this question, always show expected output too in code tags.
– RavinderSingh13
Nov 14 '18 at 1:41
Good that you have shown us your efforts in order to solve this question, always show expected output too in code tags.
– RavinderSingh13
Nov 14 '18 at 1:41
Good that you have shown us your efforts in order to solve this question, always show expected output too in code tags.
– RavinderSingh13
Nov 14 '18 at 1:41
add a comment |
1 Answer
1
active
oldest
votes
Since you didn't show us sample expected output so can't be sure about following. By this solution I am not hard coding field values of strings YBr
or NationalCowID
so it could be dynamic too. Could you please try following once.
awk '
BEGIN{
FS=OFS=","
}
FNR==1{
for(i=1;i<=NF;i++){
if($i=="YBr"){
field=i
}
if($i=="NationalCowID"){
value=i
}
}
}
$field==1{
for(i=value+1;i<=NF;i++){
$i="*"
}
}
1
' Input_file
Output will be as follows.
NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,*,*,*,*,*,*,*,*,*
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,*,*,*,*,*,*,*,*,*
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,*,*,*,*,*,*,*,*,*
Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?
– Cae.rich
Nov 14 '18 at 1:50
@Cae.rich, to keep the last column changefor
loop's value fromfor(i=value+1;i<=NF;i++)
TOfor(i=value+1;i<=(NF-1);i++)
and let me know then?
– RavinderSingh13
Nov 14 '18 at 1:53
1
@ RavinderSingh13 Worked perfectly! Thank-you so much for your help!
– Cae.rich
Nov 14 '18 at 2:05
@Cae.rich, your welcome.
– RavinderSingh13
Nov 14 '18 at 2:06
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53291486%2fuse-id-file-to-replace-some-columns-in-dataset-then-print-entire-datset%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Since you didn't show us sample expected output so can't be sure about following. By this solution I am not hard coding field values of strings YBr
or NationalCowID
so it could be dynamic too. Could you please try following once.
awk '
BEGIN{
FS=OFS=","
}
FNR==1{
for(i=1;i<=NF;i++){
if($i=="YBr"){
field=i
}
if($i=="NationalCowID"){
value=i
}
}
}
$field==1{
for(i=value+1;i<=NF;i++){
$i="*"
}
}
1
' Input_file
Output will be as follows.
NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,*,*,*,*,*,*,*,*,*
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,*,*,*,*,*,*,*,*,*
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,*,*,*,*,*,*,*,*,*
Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?
– Cae.rich
Nov 14 '18 at 1:50
@Cae.rich, to keep the last column changefor
loop's value fromfor(i=value+1;i<=NF;i++)
TOfor(i=value+1;i<=(NF-1);i++)
and let me know then?
– RavinderSingh13
Nov 14 '18 at 1:53
1
@ RavinderSingh13 Worked perfectly! Thank-you so much for your help!
– Cae.rich
Nov 14 '18 at 2:05
@Cae.rich, your welcome.
– RavinderSingh13
Nov 14 '18 at 2:06
add a comment |
Since you didn't show us sample expected output so can't be sure about following. By this solution I am not hard coding field values of strings YBr
or NationalCowID
so it could be dynamic too. Could you please try following once.
awk '
BEGIN{
FS=OFS=","
}
FNR==1{
for(i=1;i<=NF;i++){
if($i=="YBr"){
field=i
}
if($i=="NationalCowID"){
value=i
}
}
}
$field==1{
for(i=value+1;i<=NF;i++){
$i="*"
}
}
1
' Input_file
Output will be as follows.
NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,*,*,*,*,*,*,*,*,*
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,*,*,*,*,*,*,*,*,*
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,*,*,*,*,*,*,*,*,*
Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?
– Cae.rich
Nov 14 '18 at 1:50
@Cae.rich, to keep the last column changefor
loop's value fromfor(i=value+1;i<=NF;i++)
TOfor(i=value+1;i<=(NF-1);i++)
and let me know then?
– RavinderSingh13
Nov 14 '18 at 1:53
1
@ RavinderSingh13 Worked perfectly! Thank-you so much for your help!
– Cae.rich
Nov 14 '18 at 2:05
@Cae.rich, your welcome.
– RavinderSingh13
Nov 14 '18 at 2:06
add a comment |
Since you didn't show us sample expected output so can't be sure about following. By this solution I am not hard coding field values of strings YBr
or NationalCowID
so it could be dynamic too. Could you please try following once.
awk '
BEGIN{
FS=OFS=","
}
FNR==1{
for(i=1;i<=NF;i++){
if($i=="YBr"){
field=i
}
if($i=="NationalCowID"){
value=i
}
}
}
$field==1{
for(i=value+1;i<=NF;i++){
$i="*"
}
}
1
' Input_file
Output will be as follows.
NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,*,*,*,*,*,*,*,*,*
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,*,*,*,*,*,*,*,*,*
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,*,*,*,*,*,*,*,*,*
Since you didn't show us sample expected output so can't be sure about following. By this solution I am not hard coding field values of strings YBr
or NationalCowID
so it could be dynamic too. Could you please try following once.
awk '
BEGIN{
FS=OFS=","
}
FNR==1{
for(i=1;i<=NF;i++){
if($i=="YBr"){
field=i
}
if($i=="NationalCowID"){
value=i
}
}
}
$field==1{
for(i=value+1;i<=NF;i++){
$i="*"
}
}
1
' Input_file
Output will be as follows.
NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,*,*,*,*,*,*,*,*,*
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,*,*,*,*,*,*,*,*,*
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,*,*,*,*,*,*,*,*,*
edited Nov 14 '18 at 1:56
answered Nov 14 '18 at 1:40
RavinderSingh13RavinderSingh13
26.1k41438
26.1k41438
Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?
– Cae.rich
Nov 14 '18 at 1:50
@Cae.rich, to keep the last column changefor
loop's value fromfor(i=value+1;i<=NF;i++)
TOfor(i=value+1;i<=(NF-1);i++)
and let me know then?
– RavinderSingh13
Nov 14 '18 at 1:53
1
@ RavinderSingh13 Worked perfectly! Thank-you so much for your help!
– Cae.rich
Nov 14 '18 at 2:05
@Cae.rich, your welcome.
– RavinderSingh13
Nov 14 '18 at 2:06
add a comment |
Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?
– Cae.rich
Nov 14 '18 at 1:50
@Cae.rich, to keep the last column changefor
loop's value fromfor(i=value+1;i<=NF;i++)
TOfor(i=value+1;i<=(NF-1);i++)
and let me know then?
– RavinderSingh13
Nov 14 '18 at 1:53
1
@ RavinderSingh13 Worked perfectly! Thank-you so much for your help!
– Cae.rich
Nov 14 '18 at 2:05
@Cae.rich, your welcome.
– RavinderSingh13
Nov 14 '18 at 2:06
Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?
– Cae.rich
Nov 14 '18 at 1:50
Thank-you for your help! I will update my questions o this it includes an output file. I would like to keep the last column (qtl) as well as the first (NationalCowID). Which part of the code tells you what columns you are keeping or changing?
– Cae.rich
Nov 14 '18 at 1:50
@Cae.rich, to keep the last column change
for
loop's value from for(i=value+1;i<=NF;i++)
TO for(i=value+1;i<=(NF-1);i++)
and let me know then?– RavinderSingh13
Nov 14 '18 at 1:53
@Cae.rich, to keep the last column change
for
loop's value from for(i=value+1;i<=NF;i++)
TO for(i=value+1;i<=(NF-1);i++)
and let me know then?– RavinderSingh13
Nov 14 '18 at 1:53
1
1
@ RavinderSingh13 Worked perfectly! Thank-you so much for your help!
– Cae.rich
Nov 14 '18 at 2:05
@ RavinderSingh13 Worked perfectly! Thank-you so much for your help!
– Cae.rich
Nov 14 '18 at 2:05
@Cae.rich, your welcome.
– RavinderSingh13
Nov 14 '18 at 2:06
@Cae.rich, your welcome.
– RavinderSingh13
Nov 14 '18 at 2:06
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53291486%2fuse-id-file-to-replace-some-columns-in-dataset-then-print-entire-datset%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Good that you have shown us your efforts in order to solve this question, always show expected output too in code tags.
– RavinderSingh13
Nov 14 '18 at 1:41