Mapping a dataframe (with NA) to an n by n adjacency matrix (as a data.frame object)
up vote
0
down vote
favorite
I have a three-column dataframe object recording the bilateral trade data between 161 countries, the data are of dyadic format containing 19687 rows, three columns (reporter (rid), partner (pid), and their bilateral trade flow (TradeValue) in a given year). rid or pid takes a value from 1 to 161, and a country is assigned the same rid and pid. For any given pair of (rid, pid) in which rid =/= pid, TradeValue(rid, pid) = TradeValue(pid, rid).
The data (run in R) look like this:
#load the data from dropbox folder
library(foreign)
example_data <- read.csv("https://www.dropbox.com/s/hf0ga22tdjlvdvr/example_data.csv?dl=1")
head(example_data, n = 10)
rid pid TradeValue
1 2 3 500
2 2 7 2328
3 2 8 2233465
4 2 9 81470
5 2 12 572893
6 2 17 488374
7 2 19 3314932
8 2 23 20323
9 2 25 10
10 2 29 9026220The data were sourced from UN Comtrade database, each rid is paired with multiple pid to get their bilateral trade data, but as can be seen, not every pid has a numeric id value because I only assigned a rid or pid to a country if a list of relevant economic indicators of that country are available, which is why there are NA in the data despite TradeValue exists between that country and the reporting country (rid). The same applies when a country become a "reporter," in that situation, that country did not report any TradeValue with partners, and its id number is absent from the rid column. (Hence, you can see rid column begins with 2, because country 1 (i.e., Afghanistan) did not report any bilateral trade data with partners). A quick check with summary statistics helps confirm this
length(unique(example_data$rid))
[1] 139
# only 139 countries reported bilateral trade statistics with partners
length(unique(example_data$pid))
[1] 162
# that extra pid is NA (161 + NA = 162)Since most countries report bilateral trade data with partners and for those who don't, they tend to be small economies. Hence, I want to preserve the complete list of 161 countries and transform this example_data dataframe into a 161 x 161 adjacency matrix in which
- for those countries that are absent from the
ridcolumn (e.g.,rid== 1), create each of them a row and set the entire row (in the 161 x 161 matrix) to 0. - for those countries (
pid) that do not shareTradeValueentries with a particularrid, set those cells to 0.
For example, suppose in a 5 x 5 adjacency matrix, country 1 did not report any trade statistics with partners, the other four reported their bilateral trade statistics with other (except country 1). The original dataframe is like
rid pid TradeValue
2 3 223
2 4 13
2 5 9
3 2 223
3 4 57
3 5 28
4 2 13
4 3 57
4 5 82
5 2 9
5 3 28
5 4 82from which I want to convert it to a 5 x 5 adjacency matrix (of data.frame format), the desired output should look like this
V1 V2 V3 V4 V5
1 0 0 0 0 0
2 0 0 223 13 9
3 0 223 0 57 28
4 0 13 57 0 82
5 0 9 28 82 0And using the same method on the example_data to create a 161 x 161 adjacency matrix. However, after a couple trial and error with reshape and other methods, I still could not get around with such conversion, not even beyond the first step.
It will be really appreciated if anyone could enlighten me on this?
r dataframe reshape adjacency-matrix
add a comment |
up vote
0
down vote
favorite
I have a three-column dataframe object recording the bilateral trade data between 161 countries, the data are of dyadic format containing 19687 rows, three columns (reporter (rid), partner (pid), and their bilateral trade flow (TradeValue) in a given year). rid or pid takes a value from 1 to 161, and a country is assigned the same rid and pid. For any given pair of (rid, pid) in which rid =/= pid, TradeValue(rid, pid) = TradeValue(pid, rid).
The data (run in R) look like this:
#load the data from dropbox folder
library(foreign)
example_data <- read.csv("https://www.dropbox.com/s/hf0ga22tdjlvdvr/example_data.csv?dl=1")
head(example_data, n = 10)
rid pid TradeValue
1 2 3 500
2 2 7 2328
3 2 8 2233465
4 2 9 81470
5 2 12 572893
6 2 17 488374
7 2 19 3314932
8 2 23 20323
9 2 25 10
10 2 29 9026220The data were sourced from UN Comtrade database, each rid is paired with multiple pid to get their bilateral trade data, but as can be seen, not every pid has a numeric id value because I only assigned a rid or pid to a country if a list of relevant economic indicators of that country are available, which is why there are NA in the data despite TradeValue exists between that country and the reporting country (rid). The same applies when a country become a "reporter," in that situation, that country did not report any TradeValue with partners, and its id number is absent from the rid column. (Hence, you can see rid column begins with 2, because country 1 (i.e., Afghanistan) did not report any bilateral trade data with partners). A quick check with summary statistics helps confirm this
length(unique(example_data$rid))
[1] 139
# only 139 countries reported bilateral trade statistics with partners
length(unique(example_data$pid))
[1] 162
# that extra pid is NA (161 + NA = 162)Since most countries report bilateral trade data with partners and for those who don't, they tend to be small economies. Hence, I want to preserve the complete list of 161 countries and transform this example_data dataframe into a 161 x 161 adjacency matrix in which
- for those countries that are absent from the
ridcolumn (e.g.,rid== 1), create each of them a row and set the entire row (in the 161 x 161 matrix) to 0. - for those countries (
pid) that do not shareTradeValueentries with a particularrid, set those cells to 0.
For example, suppose in a 5 x 5 adjacency matrix, country 1 did not report any trade statistics with partners, the other four reported their bilateral trade statistics with other (except country 1). The original dataframe is like
rid pid TradeValue
2 3 223
2 4 13
2 5 9
3 2 223
3 4 57
3 5 28
4 2 13
4 3 57
4 5 82
5 2 9
5 3 28
5 4 82from which I want to convert it to a 5 x 5 adjacency matrix (of data.frame format), the desired output should look like this
V1 V2 V3 V4 V5
1 0 0 0 0 0
2 0 0 223 13 9
3 0 223 0 57 28
4 0 13 57 0 82
5 0 9 28 82 0And using the same method on the example_data to create a 161 x 161 adjacency matrix. However, after a couple trial and error with reshape and other methods, I still could not get around with such conversion, not even beyond the first step.
It will be really appreciated if anyone could enlighten me on this?
r dataframe reshape adjacency-matrix
add a comment |
up vote
0
down vote
favorite
up vote
0
down vote
favorite
I have a three-column dataframe object recording the bilateral trade data between 161 countries, the data are of dyadic format containing 19687 rows, three columns (reporter (rid), partner (pid), and their bilateral trade flow (TradeValue) in a given year). rid or pid takes a value from 1 to 161, and a country is assigned the same rid and pid. For any given pair of (rid, pid) in which rid =/= pid, TradeValue(rid, pid) = TradeValue(pid, rid).
The data (run in R) look like this:
#load the data from dropbox folder
library(foreign)
example_data <- read.csv("https://www.dropbox.com/s/hf0ga22tdjlvdvr/example_data.csv?dl=1")
head(example_data, n = 10)
rid pid TradeValue
1 2 3 500
2 2 7 2328
3 2 8 2233465
4 2 9 81470
5 2 12 572893
6 2 17 488374
7 2 19 3314932
8 2 23 20323
9 2 25 10
10 2 29 9026220The data were sourced from UN Comtrade database, each rid is paired with multiple pid to get their bilateral trade data, but as can be seen, not every pid has a numeric id value because I only assigned a rid or pid to a country if a list of relevant economic indicators of that country are available, which is why there are NA in the data despite TradeValue exists between that country and the reporting country (rid). The same applies when a country become a "reporter," in that situation, that country did not report any TradeValue with partners, and its id number is absent from the rid column. (Hence, you can see rid column begins with 2, because country 1 (i.e., Afghanistan) did not report any bilateral trade data with partners). A quick check with summary statistics helps confirm this
length(unique(example_data$rid))
[1] 139
# only 139 countries reported bilateral trade statistics with partners
length(unique(example_data$pid))
[1] 162
# that extra pid is NA (161 + NA = 162)Since most countries report bilateral trade data with partners and for those who don't, they tend to be small economies. Hence, I want to preserve the complete list of 161 countries and transform this example_data dataframe into a 161 x 161 adjacency matrix in which
- for those countries that are absent from the
ridcolumn (e.g.,rid== 1), create each of them a row and set the entire row (in the 161 x 161 matrix) to 0. - for those countries (
pid) that do not shareTradeValueentries with a particularrid, set those cells to 0.
For example, suppose in a 5 x 5 adjacency matrix, country 1 did not report any trade statistics with partners, the other four reported their bilateral trade statistics with other (except country 1). The original dataframe is like
rid pid TradeValue
2 3 223
2 4 13
2 5 9
3 2 223
3 4 57
3 5 28
4 2 13
4 3 57
4 5 82
5 2 9
5 3 28
5 4 82from which I want to convert it to a 5 x 5 adjacency matrix (of data.frame format), the desired output should look like this
V1 V2 V3 V4 V5
1 0 0 0 0 0
2 0 0 223 13 9
3 0 223 0 57 28
4 0 13 57 0 82
5 0 9 28 82 0And using the same method on the example_data to create a 161 x 161 adjacency matrix. However, after a couple trial and error with reshape and other methods, I still could not get around with such conversion, not even beyond the first step.
It will be really appreciated if anyone could enlighten me on this?
r dataframe reshape adjacency-matrix
I have a three-column dataframe object recording the bilateral trade data between 161 countries, the data are of dyadic format containing 19687 rows, three columns (reporter (rid), partner (pid), and their bilateral trade flow (TradeValue) in a given year). rid or pid takes a value from 1 to 161, and a country is assigned the same rid and pid. For any given pair of (rid, pid) in which rid =/= pid, TradeValue(rid, pid) = TradeValue(pid, rid).
The data (run in R) look like this:
#load the data from dropbox folder
library(foreign)
example_data <- read.csv("https://www.dropbox.com/s/hf0ga22tdjlvdvr/example_data.csv?dl=1")
head(example_data, n = 10)
rid pid TradeValue
1 2 3 500
2 2 7 2328
3 2 8 2233465
4 2 9 81470
5 2 12 572893
6 2 17 488374
7 2 19 3314932
8 2 23 20323
9 2 25 10
10 2 29 9026220The data were sourced from UN Comtrade database, each rid is paired with multiple pid to get their bilateral trade data, but as can be seen, not every pid has a numeric id value because I only assigned a rid or pid to a country if a list of relevant economic indicators of that country are available, which is why there are NA in the data despite TradeValue exists between that country and the reporting country (rid). The same applies when a country become a "reporter," in that situation, that country did not report any TradeValue with partners, and its id number is absent from the rid column. (Hence, you can see rid column begins with 2, because country 1 (i.e., Afghanistan) did not report any bilateral trade data with partners). A quick check with summary statistics helps confirm this
length(unique(example_data$rid))
[1] 139
# only 139 countries reported bilateral trade statistics with partners
length(unique(example_data$pid))
[1] 162
# that extra pid is NA (161 + NA = 162)Since most countries report bilateral trade data with partners and for those who don't, they tend to be small economies. Hence, I want to preserve the complete list of 161 countries and transform this example_data dataframe into a 161 x 161 adjacency matrix in which
- for those countries that are absent from the
ridcolumn (e.g.,rid== 1), create each of them a row and set the entire row (in the 161 x 161 matrix) to 0. - for those countries (
pid) that do not shareTradeValueentries with a particularrid, set those cells to 0.
For example, suppose in a 5 x 5 adjacency matrix, country 1 did not report any trade statistics with partners, the other four reported their bilateral trade statistics with other (except country 1). The original dataframe is like
rid pid TradeValue
2 3 223
2 4 13
2 5 9
3 2 223
3 4 57
3 5 28
4 2 13
4 3 57
4 5 82
5 2 9
5 3 28
5 4 82from which I want to convert it to a 5 x 5 adjacency matrix (of data.frame format), the desired output should look like this
V1 V2 V3 V4 V5
1 0 0 0 0 0
2 0 0 223 13 9
3 0 223 0 57 28
4 0 13 57 0 82
5 0 9 28 82 0And using the same method on the example_data to create a 161 x 161 adjacency matrix. However, after a couple trial and error with reshape and other methods, I still could not get around with such conversion, not even beyond the first step.
It will be really appreciated if anyone could enlighten me on this?
#load the data from dropbox folder
library(foreign)
example_data <- read.csv("https://www.dropbox.com/s/hf0ga22tdjlvdvr/example_data.csv?dl=1")
head(example_data, n = 10)
rid pid TradeValue
1 2 3 500
2 2 7 2328
3 2 8 2233465
4 2 9 81470
5 2 12 572893
6 2 17 488374
7 2 19 3314932
8 2 23 20323
9 2 25 10
10 2 29 9026220#load the data from dropbox folder
library(foreign)
example_data <- read.csv("https://www.dropbox.com/s/hf0ga22tdjlvdvr/example_data.csv?dl=1")
head(example_data, n = 10)
rid pid TradeValue
1 2 3 500
2 2 7 2328
3 2 8 2233465
4 2 9 81470
5 2 12 572893
6 2 17 488374
7 2 19 3314932
8 2 23 20323
9 2 25 10
10 2 29 9026220length(unique(example_data$rid))
[1] 139
# only 139 countries reported bilateral trade statistics with partners
length(unique(example_data$pid))
[1] 162
# that extra pid is NA (161 + NA = 162)length(unique(example_data$rid))
[1] 139
# only 139 countries reported bilateral trade statistics with partners
length(unique(example_data$pid))
[1] 162
# that extra pid is NA (161 + NA = 162)rid pid TradeValue
2 3 223
2 4 13
2 5 9
3 2 223
3 4 57
3 5 28
4 2 13
4 3 57
4 5 82
5 2 9
5 3 28
5 4 82rid pid TradeValue
2 3 223
2 4 13
2 5 9
3 2 223
3 4 57
3 5 28
4 2 13
4 3 57
4 5 82
5 2 9
5 3 28
5 4 82 V1 V2 V3 V4 V5
1 0 0 0 0 0
2 0 0 223 13 9
3 0 223 0 57 28
4 0 13 57 0 82
5 0 9 28 82 0 V1 V2 V3 V4 V5
1 0 0 0 0 0
2 0 0 223 13 9
3 0 223 0 57 28
4 0 13 57 0 82
5 0 9 28 82 0r dataframe reshape adjacency-matrix
r dataframe reshape adjacency-matrix
edited Nov 8 at 11:32
asked Nov 7 at 14:17
Chris T.
347316
347316
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
up vote
1
down vote
accepted
I cannot read the dropbox file but have tried to work off of your 5-country example dataframe -
country_num = 5
# check countries missing in rid and pid
rid_miss = setdiff(1:country_num, example_data$rid)
pid_miss = ifelse(length(setdiff(1:country_num, example_data$pid) == 0),
1, setdiff(1:country_num, example_data$pid))
# create dummy dataframe with missing rid and pid
add_data = as.data.frame(do.call(cbind, list(rid_miss, pid_miss, NA)))
colnames(add_data) = colnames(example_data)
# add dummy dataframe to original
example_data = rbind(example_data, add_data)
# the dcast now takes missing rid and pid into account
mat = dcast(example_data, rid ~ pid, value.var = "TradeValue")
# can remove first column without setting colnames but this is more failproof
rownames(mat) = mat[, 1]
mat = as.matrix(mat[, -1])
# fill in upper triangular matrix with missing values of lower triangular matrix
# and vice-versa since TradeValue(rid, pid) = TradeValue(pid, rid)
mat[is.na(mat)] = t(mat)[is.na(mat)]
# change NAs to 0 according to preference - would keep as NA to differentiate
# from actual zeros
mat[is.na(mat)] = 0
Does this help?
Just a quick touch base. Thanks so much for your response, I will look at your code after get back from work and Dropbox folder link has been updated.
– Chris T.
Nov 7 at 16:46
I applied your data to my data, but it didn't work, becauseadd_datais of different length thanexample_data, the former does not have the third column (TradeValue).
– Chris T.
Nov 8 at 2:15
It's because in myexample_data, thepidcontains complete list of all country id, if you check that withpid_miss = setdiff(1:country_num, example_data$pid),length(pid_miss)is 0. Puttingpid_missintolist(rid_miss, pid_miss, NA)will create anadd_dataobject of only 2 columns, which is not of the same length asexample_data.
– Chris T.
Nov 8 at 2:32
I know it is a big dataset but can youdputit? I am still unable to access the dropbox file, and I need it to understand the data better. Your 5 x 5 example had1missing from bothridandpidand so I assumed that was your dataset was like.
– phil_t
Nov 8 at 13:46
Thanks again for your response, but the dropbox link should work by now. The data contain 15602 rows, so it's hard todputhere.
– Chris T.
Nov 8 at 13:49
|
show 2 more comments
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
1
down vote
accepted
I cannot read the dropbox file but have tried to work off of your 5-country example dataframe -
country_num = 5
# check countries missing in rid and pid
rid_miss = setdiff(1:country_num, example_data$rid)
pid_miss = ifelse(length(setdiff(1:country_num, example_data$pid) == 0),
1, setdiff(1:country_num, example_data$pid))
# create dummy dataframe with missing rid and pid
add_data = as.data.frame(do.call(cbind, list(rid_miss, pid_miss, NA)))
colnames(add_data) = colnames(example_data)
# add dummy dataframe to original
example_data = rbind(example_data, add_data)
# the dcast now takes missing rid and pid into account
mat = dcast(example_data, rid ~ pid, value.var = "TradeValue")
# can remove first column without setting colnames but this is more failproof
rownames(mat) = mat[, 1]
mat = as.matrix(mat[, -1])
# fill in upper triangular matrix with missing values of lower triangular matrix
# and vice-versa since TradeValue(rid, pid) = TradeValue(pid, rid)
mat[is.na(mat)] = t(mat)[is.na(mat)]
# change NAs to 0 according to preference - would keep as NA to differentiate
# from actual zeros
mat[is.na(mat)] = 0
Does this help?
Just a quick touch base. Thanks so much for your response, I will look at your code after get back from work and Dropbox folder link has been updated.
– Chris T.
Nov 7 at 16:46
I applied your data to my data, but it didn't work, becauseadd_datais of different length thanexample_data, the former does not have the third column (TradeValue).
– Chris T.
Nov 8 at 2:15
It's because in myexample_data, thepidcontains complete list of all country id, if you check that withpid_miss = setdiff(1:country_num, example_data$pid),length(pid_miss)is 0. Puttingpid_missintolist(rid_miss, pid_miss, NA)will create anadd_dataobject of only 2 columns, which is not of the same length asexample_data.
– Chris T.
Nov 8 at 2:32
I know it is a big dataset but can youdputit? I am still unable to access the dropbox file, and I need it to understand the data better. Your 5 x 5 example had1missing from bothridandpidand so I assumed that was your dataset was like.
– phil_t
Nov 8 at 13:46
Thanks again for your response, but the dropbox link should work by now. The data contain 15602 rows, so it's hard todputhere.
– Chris T.
Nov 8 at 13:49
|
show 2 more comments
up vote
1
down vote
accepted
I cannot read the dropbox file but have tried to work off of your 5-country example dataframe -
country_num = 5
# check countries missing in rid and pid
rid_miss = setdiff(1:country_num, example_data$rid)
pid_miss = ifelse(length(setdiff(1:country_num, example_data$pid) == 0),
1, setdiff(1:country_num, example_data$pid))
# create dummy dataframe with missing rid and pid
add_data = as.data.frame(do.call(cbind, list(rid_miss, pid_miss, NA)))
colnames(add_data) = colnames(example_data)
# add dummy dataframe to original
example_data = rbind(example_data, add_data)
# the dcast now takes missing rid and pid into account
mat = dcast(example_data, rid ~ pid, value.var = "TradeValue")
# can remove first column without setting colnames but this is more failproof
rownames(mat) = mat[, 1]
mat = as.matrix(mat[, -1])
# fill in upper triangular matrix with missing values of lower triangular matrix
# and vice-versa since TradeValue(rid, pid) = TradeValue(pid, rid)
mat[is.na(mat)] = t(mat)[is.na(mat)]
# change NAs to 0 according to preference - would keep as NA to differentiate
# from actual zeros
mat[is.na(mat)] = 0
Does this help?
Just a quick touch base. Thanks so much for your response, I will look at your code after get back from work and Dropbox folder link has been updated.
– Chris T.
Nov 7 at 16:46
I applied your data to my data, but it didn't work, becauseadd_datais of different length thanexample_data, the former does not have the third column (TradeValue).
– Chris T.
Nov 8 at 2:15
It's because in myexample_data, thepidcontains complete list of all country id, if you check that withpid_miss = setdiff(1:country_num, example_data$pid),length(pid_miss)is 0. Puttingpid_missintolist(rid_miss, pid_miss, NA)will create anadd_dataobject of only 2 columns, which is not of the same length asexample_data.
– Chris T.
Nov 8 at 2:32
I know it is a big dataset but can youdputit? I am still unable to access the dropbox file, and I need it to understand the data better. Your 5 x 5 example had1missing from bothridandpidand so I assumed that was your dataset was like.
– phil_t
Nov 8 at 13:46
Thanks again for your response, but the dropbox link should work by now. The data contain 15602 rows, so it's hard todputhere.
– Chris T.
Nov 8 at 13:49
|
show 2 more comments
up vote
1
down vote
accepted
up vote
1
down vote
accepted
I cannot read the dropbox file but have tried to work off of your 5-country example dataframe -
country_num = 5
# check countries missing in rid and pid
rid_miss = setdiff(1:country_num, example_data$rid)
pid_miss = ifelse(length(setdiff(1:country_num, example_data$pid) == 0),
1, setdiff(1:country_num, example_data$pid))
# create dummy dataframe with missing rid and pid
add_data = as.data.frame(do.call(cbind, list(rid_miss, pid_miss, NA)))
colnames(add_data) = colnames(example_data)
# add dummy dataframe to original
example_data = rbind(example_data, add_data)
# the dcast now takes missing rid and pid into account
mat = dcast(example_data, rid ~ pid, value.var = "TradeValue")
# can remove first column without setting colnames but this is more failproof
rownames(mat) = mat[, 1]
mat = as.matrix(mat[, -1])
# fill in upper triangular matrix with missing values of lower triangular matrix
# and vice-versa since TradeValue(rid, pid) = TradeValue(pid, rid)
mat[is.na(mat)] = t(mat)[is.na(mat)]
# change NAs to 0 according to preference - would keep as NA to differentiate
# from actual zeros
mat[is.na(mat)] = 0
Does this help?
I cannot read the dropbox file but have tried to work off of your 5-country example dataframe -
country_num = 5
# check countries missing in rid and pid
rid_miss = setdiff(1:country_num, example_data$rid)
pid_miss = ifelse(length(setdiff(1:country_num, example_data$pid) == 0),
1, setdiff(1:country_num, example_data$pid))
# create dummy dataframe with missing rid and pid
add_data = as.data.frame(do.call(cbind, list(rid_miss, pid_miss, NA)))
colnames(add_data) = colnames(example_data)
# add dummy dataframe to original
example_data = rbind(example_data, add_data)
# the dcast now takes missing rid and pid into account
mat = dcast(example_data, rid ~ pid, value.var = "TradeValue")
# can remove first column without setting colnames but this is more failproof
rownames(mat) = mat[, 1]
mat = as.matrix(mat[, -1])
# fill in upper triangular matrix with missing values of lower triangular matrix
# and vice-versa since TradeValue(rid, pid) = TradeValue(pid, rid)
mat[is.na(mat)] = t(mat)[is.na(mat)]
# change NAs to 0 according to preference - would keep as NA to differentiate
# from actual zeros
mat[is.na(mat)] = 0
Does this help?
edited Nov 8 at 14:43
answered Nov 7 at 15:41
phil_t
412210
412210
Just a quick touch base. Thanks so much for your response, I will look at your code after get back from work and Dropbox folder link has been updated.
– Chris T.
Nov 7 at 16:46
I applied your data to my data, but it didn't work, becauseadd_datais of different length thanexample_data, the former does not have the third column (TradeValue).
– Chris T.
Nov 8 at 2:15
It's because in myexample_data, thepidcontains complete list of all country id, if you check that withpid_miss = setdiff(1:country_num, example_data$pid),length(pid_miss)is 0. Puttingpid_missintolist(rid_miss, pid_miss, NA)will create anadd_dataobject of only 2 columns, which is not of the same length asexample_data.
– Chris T.
Nov 8 at 2:32
I know it is a big dataset but can youdputit? I am still unable to access the dropbox file, and I need it to understand the data better. Your 5 x 5 example had1missing from bothridandpidand so I assumed that was your dataset was like.
– phil_t
Nov 8 at 13:46
Thanks again for your response, but the dropbox link should work by now. The data contain 15602 rows, so it's hard todputhere.
– Chris T.
Nov 8 at 13:49
|
show 2 more comments
Just a quick touch base. Thanks so much for your response, I will look at your code after get back from work and Dropbox folder link has been updated.
– Chris T.
Nov 7 at 16:46
I applied your data to my data, but it didn't work, becauseadd_datais of different length thanexample_data, the former does not have the third column (TradeValue).
– Chris T.
Nov 8 at 2:15
It's because in myexample_data, thepidcontains complete list of all country id, if you check that withpid_miss = setdiff(1:country_num, example_data$pid),length(pid_miss)is 0. Puttingpid_missintolist(rid_miss, pid_miss, NA)will create anadd_dataobject of only 2 columns, which is not of the same length asexample_data.
– Chris T.
Nov 8 at 2:32
I know it is a big dataset but can youdputit? I am still unable to access the dropbox file, and I need it to understand the data better. Your 5 x 5 example had1missing from bothridandpidand so I assumed that was your dataset was like.
– phil_t
Nov 8 at 13:46
Thanks again for your response, but the dropbox link should work by now. The data contain 15602 rows, so it's hard todputhere.
– Chris T.
Nov 8 at 13:49
Just a quick touch base. Thanks so much for your response, I will look at your code after get back from work and Dropbox folder link has been updated.
– Chris T.
Nov 7 at 16:46
Just a quick touch base. Thanks so much for your response, I will look at your code after get back from work and Dropbox folder link has been updated.
– Chris T.
Nov 7 at 16:46
I applied your data to my data, but it didn't work, because
add_data is of different length than example_data, the former does not have the third column (TradeValue).– Chris T.
Nov 8 at 2:15
I applied your data to my data, but it didn't work, because
add_data is of different length than example_data, the former does not have the third column (TradeValue).– Chris T.
Nov 8 at 2:15
It's because in my
example_data, the pid contains complete list of all country id, if you check that with pid_miss = setdiff(1:country_num, example_data$pid), length(pid_miss) is 0. Putting pid_miss into list(rid_miss, pid_miss, NA) will create an add_data object of only 2 columns, which is not of the same length as example_data.– Chris T.
Nov 8 at 2:32
It's because in my
example_data, the pid contains complete list of all country id, if you check that with pid_miss = setdiff(1:country_num, example_data$pid), length(pid_miss) is 0. Putting pid_miss into list(rid_miss, pid_miss, NA) will create an add_data object of only 2 columns, which is not of the same length as example_data.– Chris T.
Nov 8 at 2:32
I know it is a big dataset but can you
dput it? I am still unable to access the dropbox file, and I need it to understand the data better. Your 5 x 5 example had 1 missing from both rid and pid and so I assumed that was your dataset was like.– phil_t
Nov 8 at 13:46
I know it is a big dataset but can you
dput it? I am still unable to access the dropbox file, and I need it to understand the data better. Your 5 x 5 example had 1 missing from both rid and pid and so I assumed that was your dataset was like.– phil_t
Nov 8 at 13:46
Thanks again for your response, but the dropbox link should work by now. The data contain 15602 rows, so it's hard to
dput here.– Chris T.
Nov 8 at 13:49
Thanks again for your response, but the dropbox link should work by now. The data contain 15602 rows, so it's hard to
dput here.– Chris T.
Nov 8 at 13:49
|
show 2 more comments
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53191265%2fmapping-a-dataframe-with-na-to-an-n-by-n-adjacency-matrix-as-a-data-frame-obj%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown