OpenMP - store results in vector [duplicate]

This question already has an answer here:

OpenMP multiple threads update same array

2 answers

I want to parallelize a for loop with many iterations using OpenMP. The results should be stored in a vector.

for (int i=0; i<n; i++)
{
    // not every iteration produces a result
    if (condition)
    {
        results.push_back (result_value);
    }
}

This does not work properly with #pragma omp parallel for.

So what's the best practice to achieve that?

Is it somehow possible to use a separate results vector for each thread and then combine all result vectors at the end? The ordering of the results is not important.

Something like the following is not practical because it consumes too much space:

int *results = new int[n];
for (int i=0; i<n; i++)
{
    // not every iteration produces a result
    if (condition)
    {
        results[i] = result_value;
    }
}

// remove all unused slots in results array

edited Nov 7 at 13:49

asked Nov 7 at 13:44

flappix

997

marked as duplicate by Zulan, Community♦ Nov 7 at 14:20

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.

1

#pragma omp critical?
– LogicStuff
Nov 7 at 13:48

Yeah but this will slow down the whole process a lot, right?
– flappix
Nov 7 at 13:51

But it will also correct it a lot.
– LogicStuff
Nov 7 at 13:54

c++ openmp


2 Answers

accepted

The "naive" way:
You can create one vector per thread (call omp_get_max_threads() before the parallel region to get an upper bound on the number of threads it will use), then call omp_get_thread_num() inside the parallel region to get the current thread's ID and let each thread write into its own vector.
Then, outside the parallel region, merge the vectors together. Whether this is worth it depends on how "heavy" your processing is compared to the time required to merge the vectors.
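
As a rough illustration of the per-thread-vector idea, here is a minimal sketch. The function name collect_per_thread and the helper produces_result (a stand-in for the question's "condition") are hypothetical, and the serial fallbacks are only there so the code also builds when OpenMP is not enabled.

```cpp
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Serial fallbacks so the sketch still builds when OpenMP is disabled.
static int omp_get_max_threads() { return 1; }
static int omp_get_thread_num() { return 0; }
#endif

// Hypothetical stand-in for the question's "condition": keep multiples of 3.
static bool produces_result(int i) { return i % 3 == 0; }

std::vector<int> collect_per_thread(int n)
{
    // One vector per possible thread, sized before the parallel region.
    std::vector<std::vector<int>> partial(omp_get_max_threads());

    #pragma omp parallel for
    for (int i = 0; i < n; i++)
    {
        if (produces_result(i))
        {
            // Each thread only touches its own slot, so no locking is needed.
            partial[omp_get_thread_num()].push_back(i);
        }
    }

    // Serial merge outside the parallel region; the original order is not preserved.
    std::vector<int> results;
    for (const std::vector<int>& p : partial)
        results.insert(results.end(), p.begin(), p.end());
    return results;
}
```

Calling collect_per_thread(100) would return the multiples of 3 below 100 in some thread-dependent order. Note that the small vector objects sit next to each other in memory, so some false sharing between threads is possible; whether that matters depends on the workload.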

If you know the maximum final size of the vector, you can reserve it before processing (so that push_back calls never reallocate, which saves time) and then call push_back from inside a critical section (#pragma omp critical). Critical sections serialize the threads, though, so this only pays off if the processing inside the loop is time-consuming. In your case the "processing" appears to be just the if check, so it's probably not worth it.

Finally, this is a well-known problem. For more detailed information, read:
C++ OpenMP Parallel For Loop - Alternatives to std::vector

answered Nov 7 at 14:43

L.C.

1789


Option 1: If each iteration takes a significant amount of time before adding the element to the vector, you can keep the push_back in a critical region:

#pragma omp parallel for
for (int i=0; i<n; i++)
{
    // not every iteration produces a result
    if (condition)
    {
        // only one thread at a time may modify the shared vector
        #pragma omp critical
        results.push_back (result_value);
    }
}

If threads are mostly busy with other things than the push_back, there will be little overhead from the critical region.

Option 2: If iterations are too cheap compared to the synchronization overhead, you can have each thread fill a thread-private vector and then merge the vectors at the end.

There is a good duplicate for this here and here.
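
A minimal sketch of Option 2, under some assumptions: the function name collect_thread_private and the helper is_hit (standing in for the question's "condition") are hypothetical, and this is one possible way to write it rather than the code from the linked duplicates. Each thread fills a vector declared inside the parallel region and appends it to the shared vector once, so the critical section is entered once per thread instead of once per element.

```cpp
#include <vector>

// Hypothetical stand-in for the question's "condition": keep multiples of 3.
static bool is_hit(int i) { return i % 3 == 0; }

std::vector<int> collect_thread_private(int n)
{
    std::vector<int> results;

    #pragma omp parallel
    {
        // Declared inside the parallel region, so every thread has its own copy.
        std::vector<int> local;

        // nowait: a thread may start merging without waiting for the others.
        #pragma omp for nowait
        for (int i = 0; i < n; i++)
        {
            if (is_hit(i))
                local.push_back(i);
        }

        // One critical section per thread rather than one per element.
        #pragma omp critical
        results.insert(results.end(), local.begin(), local.end());
    }
    return results;
}
```

The order of the merged results depends on which thread reaches the critical section first, which is acceptable here because the question states that ordering does not matter.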

edited Nov 7 at 13:56

answered Nov 7 at 13:51

Max Langhof

7,1521133
