c++ - OpenMP share file handler -

I've got a loop which I parallelize using OpenMP. In this loop, I read a triangle from a file and perform some operations on its data. These operations are independent from one triangle to another, so I thought this would be easy to parallelize, as long as I kept the actual reading of the file in a critical section.

The order in which the triangles are read is not important.
Some triangles are read and discarded pretty quickly, others need more algorithmic work (bbox construction, ...).
I'm doing binary I/O.
I'm using a C++ ifstream, *tri_data*.
I'm testing this on an SSD.

readtriangle calls file.read() and reads 12 floats from the ifstream.

#pragma omp parallel for shared(tri_data)
for (int i = 0; i < ntriangles; i++) {
    vec3 v0, v1, v2, normal;
    #pragma omp critical
    {
        // only one thread at a time may touch the file
        readtriangle(tri_data, v0, v1, v2, normal);
    }
    // (working with the triangle here)
}

Now, the behaviour I'm observing is that with OpenMP enabled, the whole process is slower. I've added some timers to my code to track the time spent in the I/O method and the time spent in the loop itself.

Without OpenMP:

total io in time       : 41.836 s.
total algorithm time   : 15.495 s.

With OpenMP:

total io in time       : 48.959 s.
total algorithm time   : 44.61 s.

My guess is that, since the reading is in a critical section, the threads are waiting for each other to finish using the file handler, which results in a longer waiting time.
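As a rough sanity check on what can be gained at all, here is a back-of-the-envelope bound. It assumes the serial timings above and that the I/O stays fully serialized, so the total run time can never drop below the I/O time:

```latex
S_{\max} = \frac{T_{\text{io}} + T_{\text{alg}}}{T_{\text{io}}}
         = \frac{41.836\,\text{s} + 15.495\,\text{s}}{41.836\,\text{s}}
         \approx 1.37
```

So even if the algorithmic part were overlapped perfectly with the reads, the whole process would speed up by at most about 1.37x under these assumptions.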

Any pointers on how to resolve this? My program would really benefit from the possibility to process the read triangles with multiple threads. I've tried toying with thread scheduling and related stuff, but that doesn't seem to help a lot in this instance.

Since I'm working on an out-of-core algorithm, introducing a buffer to hold a multitude of triangles is not really an option.

So, the solution I would propose is based on a master/slave strategy, where:

the master (thread 0) performs all the I/O
the slaves do the work on the retrieved data

The pseudo-code would read like the following:

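#include <omp.h>
#include <vector>
using std::vector;

vector<vec3> v0, v1, v2, normal; // one buffer slot per thread
vector<int> tdone;               // 1 = slot free, 0 = work pending

int nthreads;
int triangles_read = 0;

/* ... */

#pragma omp parallel shared(tri_data)
{
  int id = omp_get_thread_num();
  /*
   * Initialize the buffers in a single thread.
   * Notice that the size in memory is similar to your example.
   */
  #pragma omp single
  {
    nthreads = omp_get_num_threads();
    v0.resize(nthreads);
    v1.resize(nthreads);
    v2.resize(nthreads);
    normal.resize(nthreads);
    tdone.resize(nthreads, 1);
  } // implicit barrier: the buffers exist before any thread goes on

  if (id == 0) { // producer thread (needs at least two threads in the team)
    int next = 1;
    while (triangles_read != ntriangles) {
      #pragma omp flush
      if (tdone[next]) { // if the next thread is free
        readtriangle(tri_data, v0[next], v1[next], v2[next], normal[next]); // read data and fill the correct buffer
        #pragma omp flush
        tdone[next] = 0; // set the flag so that thread next can start working
        #pragma omp flush
        triangles_read++; // counted only after the flag is set, so no consumer exits early
        #pragma omp flush
      }
      next = next % (nthreads - 1) + 1; // set next
    } // while
  } else { // consumer threads
    while (true) { // wait for work
      #pragma omp flush
      bool all_read = (triangles_read == ntriangles); // sample the counter before the flag
      #pragma omp flush
      if (tdone[id] == 0) {
        /* ... do the work here on v0[id], v1[id], v2[id], normal[id] ... */
        tdone[id] = 1; // assignment; the original `==` had no effect
        #pragma omp flush
      } else if (all_read) {
        break; // work finished
      }
    }
  }
  #pragma omp barrier
}
// Note: flush takes whole variables, not array elements, hence the plain flushes.
// This is still pseudo-code; real code should use atomic reads/writes for the flags.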

I am not sure if this is still valuable, but it was a nice teaser anyhow!
