Performance cost: No loop fusion across function barriers

For style and performance reasons, I found myself comparing the following two ways to add 1 to every element of an array. Is it possible to get equivalent performance from both?

function inplaceadd1!(ar)
    ar .= ar .+ 1.
end

function add1(ar)
    return(ar .+ 1.)
end

function inplace!(ar)
    ar .= add1(ar)
end

ar1 = rand(10000)
ar2 = ar1[:]

@time inplaceadd1!(ar2)
# 0.000010 seconds (4 allocations: 160 bytes)
@time inplace!(ar1)
# 0.000026 seconds (6 allocations: 78.359 KiB)

I don't know much about compiler optimizations, but it seems to me that add1 could be inlined into inplace! and the two loops fused, giving identical performance without the extra allocation. Does this not happen?

Appreciate the insight and any recommendations.

asked Nov 5 at 3:33

arch1190

julia-lang

1 Answer

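No, it does not occur in your case. add1 returns a new array, and the compiler is not able to figure out that this temporary array is unnecessary. Loop fusion is a syntactic feature: only dotted calls nested within a single expression are fused, so the broadcast hidden inside add1 is not merged with the outer .= assignment. Note that the ! suffix is purely a naming convention and does not mean anything special to the compiler at the moment.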
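
To make the difference concrete, here is a rough sketch of what the two forms amount to, written out by hand. It assumes Julia 1.x, where dotted expressions are built from `Base.Broadcast.broadcasted` and `Base.Broadcast.materialize!`; the names `fused!` and `unfused!` are made up for this illustration and are not part of the question.

```julia
# Roughly what `ar .= ar .+ 1.0` does: build one lazy broadcast object,
# then run a single loop that writes straight into `ar`.
function fused!(ar)
    lazy = Base.Broadcast.broadcasted(+, ar, 1.0)   # no array is created here
    Base.Broadcast.materialize!(ar, lazy)           # one pass over the data
    return ar
end

# Roughly what `ar .= add1(ar)` does when add1 broadcasts internally:
# the inner broadcast is materialized into a new array before the
# outer assignment ever sees it.
function unfused!(ar)
    inner = Base.Broadcast.broadcasted(+, ar, 1.0)
    tmp = Base.Broadcast.materialize(inner)         # temporary array (add1's return value)
    outer = Base.Broadcast.broadcasted(identity, tmp)
    Base.Broadcast.materialize!(ar, outer)          # second pass copying tmp into ar
    return ar
end

x = [1.0, 2.0, 3.0]
y = copy(x)
fused!(x)
unfused!(y)
println(x == y)
```

Both functions compute the same values; the second simply pays for an extra array and an extra pass, which is where the additional allocation in the question's timing comes from.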

Instead, write your function so that it operates on a single element, call it with dot syntax, and let loop fusion do its work. This is the more idiomatic Julia approach when you are defining element-wise operations.

function inplaceadd1!(ar)
    ar .= ar .+ 1.
end

function add1(a)
    a + 1. # no `.+` here
end

function inplace!(ar)
    ar .= add1.(ar)
end
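
A side benefit of the scalar definition is that it composes with other dotted calls in the same expression, and everything still runs as one loop. The sketch below is only an illustration: `scale_shift!` and `scale_shift_macro!` are hypothetical names, and the arithmetic is arbitrary.

```julia
# Scalar definition, as in the answer above
add1(a) = a + 1.0

# All dotted calls in this one expression fuse into a single loop
# that writes into the preallocated `out`.
function scale_shift!(out, ar)
    out .= 2 .* add1.(ar) .- sqrt.(ar)
    return out
end

# Same computation using the `@.` macro, which adds the dots
# (including turning `=` into `.=`) automatically.
function scale_shift_macro!(out, ar)
    @. out = 2 * add1(ar) - sqrt(ar)
    return out
end

ar = [1.0, 4.0, 9.0]
out1 = similar(ar)
out2 = similar(ar)
scale_shift!(out1, ar)
scale_shift_macro!(out2, ar)
println(out1 == out2)
```

The same `add1` also works on plain numbers, for example `add1(2.0)`, which the array version from the question does only by accident of broadcasting over a scalar.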

Because add1 is a small function, the compiler should inline it automatically. You can also give the compiler a hint by annotating the function definition with the @inline macro.
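
For completeness, a minimal sketch of the annotation. The names `add1_inlined` and `inplace_inlined!` are invented here to keep the example separate from the definitions above; `@inline` is only a hint, and for a function this small it is unlikely to change anything.

```julia
# Hypothetical variant of add1 carrying an explicit inlining hint
@inline add1_inlined(a) = a + 1.0

function inplace_inlined!(ar)
    ar .= add1_inlined.(ar)   # still a single fused loop
    return ar
end

v = zeros(3)
inplace_inlined!(v)
println(v)
```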

using BenchmarkTools

@btime inplaceadd1!($ar2)
# 1.198 μs (0 allocations: 0 bytes)
@btime inplace!($ar1)
# 1.155 μs (0 allocations: 0 bytes)
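
If BenchmarkTools is not installed, the allocation behaviour can also be checked with the built-in `@allocated` macro. This is a rough sketch rather than a proper benchmark: `bytes_allocated` is a hypothetical helper, and it measures a single call after a warm-up so that compilation is not counted.

```julia
add1(a) = a + 1.0

function inplaceadd1!(ar)
    ar .= ar .+ 1.0
    return ar
end

function inplace!(ar)
    ar .= add1.(ar)
    return ar
end

# Hypothetical helper: bytes allocated by one call to f(x) after a warm-up call
function bytes_allocated(f, x)
    f(x)                      # first call triggers compilation
    return @allocated f(x)    # second call is the one measured
end

ar1 = rand(10_000)
ar2 = copy(ar1)

println(bytes_allocated(inplaceadd1!, ar2))
println(bytes_allocated(inplace!, ar1))
```

Both numbers should be far below the roughly 78 KiB that a temporary array of 10000 Float64 values would cost, which is consistent with the 0 allocations reported by @btime.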

edited Nov 5 at 4:52

answered Nov 5 at 4:37

hckr

Ah, smart. Thanks, just what I was looking for.
– arch1190
Nov 5 at 4:52
