R - linked matrices rows - make code faster



























Here is my dilemma:




  • I am trying to price a physical asset. To do this, I need to simulate a lot of random paths.

  • After each path is simulated, I have to fill 26 equally sized matrices that are linked not only to the price paths but also to each other.

  • The problem is that I am forced to go row by row through these matrices, and this is really slowing down my code (13.33 seconds per simulation).

  • By "linked" I mean the following: given two matrices A and B that are already initialized (so their first rows are filled with initial values), row 2 of B is a function of row 1 of A, then row 2 of A is a function of row 2 of B, then row 3 of B is a function of row 2 of A, and so on. So I can't fill B without filling A, and vice versa.

  • In reality, the functions that link them are very complex. A row of one matrix can be a function of rows of up to 5 other matrices, and the row indices involved are not always as simple as "the previous row" and "the same row".

  • So I find myself forced to go through all the matrices row by row to fill them: row 2 of matrix B, then row 2 of matrix A, then row 3 of matrix B, then row 3 of matrix A, and so on.


The following reproducible example shows what I mean by "linked" and what I am doing now. Please understand that in reality the functions are far more complex, and so are the links between the matrices.



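# Row 1 holds the initial values; rows 2 and 3 are still to be filled
a <- matrix(c(1, 0, 0, 1, 0, 0, 1, 0, 0), nrow = 3, ncol = 3)
b <- matrix(c(2, 0, 0, 2, 0, 0, 2, 0, 0), nrow = 3, ncol = 3)

# Row r of a depends on row r of b
function_fill_a <- function(a, b, r = 2) {
  a[r, ] <- a[r, ] + b[r, ]
  return(a)
}

# Row r of b depends on row r - 1 of a
function_fill_b <- function(a, b, r = 2) {
  b[r, ] <- a[r - 1, ] + 2
  return(b)
}

# b must be filled before a at every row
for (i in 2:nrow(a)) {
  b <- function_fill_b(a, b, i)
  a <- function_fill_a(a, b, i)
}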


Is there a way to avoid this final for loop? I am certain it is what is slowing down my code.



EDIT - Reply to comments



In my code, this for loop actually sits inside a function that takes a list of matrices and applies the functions to the matrices in the list.



So it looks like this:



# Collect the matrices in a named list
m <- list(a = a, b = b)

# Fill every linked matrix row by row, in dependency order
fill_all <- function(m) {
  for (i in 2:nrow(m$a)) {
    m$b <- function_fill_b(m$a, m$b, i)
    m$a <- function_fill_a(m$a, m$b, i)
  }
  return(m)
}

m <- fill_all(m)


If I use Rcpp, will I have to rewrite all the functions or only this last one? Is there a way to call these functions, written in R, from inside a cppFunction?



































  • Most methods I can think of (that might capitalize on R's efficiencies when vectorizing things) don't work due to the complex dependencies. Some thoughts: Rcpp for significant speed-ups; parallelization (if you have unconnected trees of matrices); general R speed-ups (BLAS, MKL); buy a faster computer (shrug).

    – r2evans
    Nov 19 '18 at 17:31
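
To make the Rcpp suggestion concrete, here is a minimal sketch for the toy example only. It assumes the Rcpp package and a C++ toolchain are installed; `fill_all_cpp` is a hypothetical name, and the loop body is rewritten in C++ instead of calling the R functions. (Rcpp does let C++ code call an R function through `Rcpp::Function`, but such calls still run in the R interpreter, so they are unlikely to be any faster than the plain R loop.)

```r
# Toy matrices from the question: row 1 holds the initial values
a <- matrix(c(1, 0, 0, 1, 0, 0, 1, 0, 0), nrow = 3, ncol = 3)
b <- matrix(c(2, 0, 0, 2, 0, 0, 2, 0, 0), nrow = 3, ncol = 3)

# Compile a C++ version of the toy loop; fall back to NULL if Rcpp or a
# compiler is unavailable (assumption: the real functions could be ported
# in the same way, which the question does not confirm)
fill_all_cpp <- tryCatch(
  Rcpp::cppFunction('
    List fill_all_cpp(NumericMatrix a_in, NumericMatrix b_in) {
      // Work on copies so the R objects passed in are left untouched
      NumericMatrix a = clone(a_in);
      NumericMatrix b = clone(b_in);
      int n = a.nrow(), p = a.ncol();
      for (int i = 1; i < n; ++i) {        // C++ indices start at 0
        for (int j = 0; j < p; ++j) {
          b(i, j) = a(i - 1, j) + 2.0;     // toy rule for b
          a(i, j) = a(i, j) + b(i, j);     // toy rule for a
        }
      }
      return List::create(Named("a") = a, Named("b") = b);
    }'),
  error = function(e) NULL
)

if (!is.null(fill_all_cpp)) {
  res <- fill_all_cpp(a, b)
  print(res$a)
}
```

Whether this pays off depends on where the time actually goes, so it is probably worth measuring before porting 26 sets of functions.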













  • The problem is similar to rolling window calculations (e.g., zoo), where there really is no way to vectorize it, since the vector is changing mid-step. (In my terms, "vectorizing" can be matrix/array-centric as well.) If this is a "production" thing, if Rcpp is daunting, you might consider other languages such as Python or Julia (though if Rcpp is scary and you aren't a language-phile, then learning Julia just for this might also be a bit steep).

    – r2evans
    Nov 19 '18 at 17:34











  • You will get some minor efficiency gains if you transpose your matrices and use columns instead of rows. R stores matrices in column-major order, so referencing/passing around values in a column is a tiny bit quicker than in a row. You also might get a speed boost by adapting your function_fill_* to take vectors and return vectors that get assigned to the matrix, rather than passing the whole matrix through. Not sure, but might be worth testing...

    – Gregor
    Nov 19 '18 at 18:43
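
A rough sketch of that idea on the toy example, assuming the real functions can be rewritten to work on single vectors. The names `next_a` and `next_b` are hypothetical stand-ins for function_fill_a and function_fill_b, and the matrices are stored transposed so that each original row becomes a contiguous column.

```r
# Toy matrices from the question: row 1 holds the initial values
a <- matrix(c(1, 0, 0, 1, 0, 0, 1, 0, 0), nrow = 3, ncol = 3)
b <- matrix(c(2, 0, 0, 2, 0, 0, 2, 0, 0), nrow = 3, ncol = 3)

# Vector-in, vector-out versions of the toy rules (hypothetical names)
next_b <- function(a_prev) a_prev + 2
next_a <- function(a_cur, b_cur) a_cur + b_cur

# Transpose once, so original row i is now column i
at <- t(a)
bt <- t(b)

# Same dependency order as before: b first, then a, one column at a time
for (i in 2:ncol(at)) {
  bt[, i] <- next_b(at[, i - 1])
  at[, i] <- next_a(at[, i], bt[, i])
}

# Transpose back to the original orientation
a_new <- t(at)
b_new <- t(bt)
print(a_new)
```

Assigning into `at` and `bt` directly in the loop, rather than passing whole matrices into a function that modifies and returns them, usually lets R update the matrices in place and skip a full copy per call.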













  • However, you seem to be making a lot of unfounded assumptions. "this for loop... it is what is slowing down my code I am certain". Why are you certain? Have you profiled your code? "Please understand that on reality the functions are way more complex..." My guess is that these "way more complex" functions eat up much more time than the surrounding for loop.

    – Gregor
    Nov 19 '18 at 18:50













    My guess is about 99% of your time is spent doing function_fill_*, and about 1% is left doing the rest. So if you work hard and convert everything else into Rcpp and speed it up 100x, you will save a little less than 1% of your overall runtime. Of course that's just a guess, but until you profile your code I think it is a good assumption. So, my recommendation would be (a) profile your code, then (b) target your improvements on the slow parts of your code. The Profiling section of Advanced R is a good place to start.

    – Gregor
    Nov 19 '18 at 18:52
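
As a starting point for step (a), here is a minimal profiling sketch using base R's `Rprof()` and `summaryRprof()` on the toy functions from the question. The matrix size (1000 x 20) is an arbitrary assumption chosen so that the profiler has something to sample; the real code would be profiled on the real inputs.

```r
# Toy fill functions and loop from the question
function_fill_a <- function(a, b, r = 2) {
  a[r, ] <- a[r, ] + b[r, ]
  return(a)
}
function_fill_b <- function(a, b, r = 2) {
  b[r, ] <- a[r - 1, ] + 2
  return(b)
}
fill_all <- function(m) {
  for (i in 2:nrow(m$a)) {
    m$b <- function_fill_b(m$a, m$b, i)
    m$a <- function_fill_a(m$a, m$b, i)
  }
  return(m)
}

# Larger toy input (assumed size): row 1 holds the initial values
n <- 1000
m <- list(a = matrix(0, n, 20), b = matrix(0, n, 20))
m$a[1, ] <- 1
m$b[1, ] <- 2

# Sample the call stack while fill_all() runs
prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)
res <- fill_all(m)
Rprof(NULL)

# summaryRprof() can fail if no samples were recorded, so guard the call
prof <- tryCatch(summaryRprof(prof_file)$by.self, error = function(e) NULL)
print(head(prof))
```

The `by.self` table lists the functions in which time was actually spent, which should show whether the fill functions or the surrounding loop dominate.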


















r performance matrix














asked Nov 19 '18 at 17:08, edited Nov 19 '18 at 18:31

user9396820


























