Fortran calling C: How do I get an efficient vectorised function?
I have to call a C function from Fortran, and I want it in a vectorised loop. I am working with the Intel 16.0.3 compilers on Linux.
So the options are: I can try to get the function to inline, or I can use a SIMD function (I want to use OpenMP SIMD for this, as I want to be portable and already use OpenMP).
If I call Fortran from Fortran, it works both ways. For passing the arguments I use the linear/ref clause to pass a reference to a vector of values rather than a vector of references, and this seems to work efficiently. In C the linear/ref clause is not recognised.
I can get the function to nominally vectorise by inserting gathers and scatters, but the performance is no better than scalar (at least for my small test function).
If I put linear(ref(r,s)) in the Fortran interface block, I get the message:
"The UVAL or REF modifiers must not be used on a dummy argument with the VALUE attribute."
I can get performance using the trick of passing by value from Fortran and returning the value as the function return. This produces a vectorised function, and performance is good, but unfortunately my real function needs to return more than one value.
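The by-value trick described above can be sketched on the C side roughly as follows. This is only an illustration, not the original test code: the names simd_c_func_val and add_arrays are hypothetical, and the C loop merely stands in for the Fortran caller, whose interface would need the VALUE attribute on both dummy arguments.

```c
#include <stddef.h>

/* Hypothetical by-value variant of the test function. Each SIMD lane
   receives r and s as plain values, so the compiler does not have to
   gather through pointers or scatter results back. */
#pragma omp declare simd
double simd_c_func_val(double r, double s)
{
    return r + s;
}

/* Hypothetical C driver standing in for the Fortran loop. */
void add_arrays(double *r, const double *s, size_t n)
{
    #pragma omp simd
    for (size_t i = 0; i < n; ++i)
        r[i] = simd_c_func_val(r[i], s[i]);
}
```

The limitation is visible in the signature: a single return slot carries only one result per call, which is why this does not extend directly to a function that must return several values.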
If I try to inline the C function, it won't work. The opt-report tells me that the callsite cannot be inlined. This is also true for a scalar function. I cannot get C functions to inline into Fortran at all. I am using -ipo to try and do this. I have wondered if the inlining problem is Fortran passing a pointer to a pointer? The code gives the right answers, so it seems this is somehow acceptable.
The test code (passing pointers) is essentially...
Fortran caller
use, intrinsic :: iso_c_binding
….
real*8, allocatable :: r(:),s(:)
.
….
interface
  integer simd_c_func(r,s) bind(c, name="simd_c_func")
    !$omp declare simd(simd_c_func)
    import :: c_double
    real(kind=c_double), intent(inout):: r,s
end interface
allocate....
!$omp simd
do i=1,n
  ierror=simd_c_func(r(i),s(i))
enddo
C callee
#pragma omp declare simd
int simd_c_func(double *r, double *s)
{
  (*r)+=(*s);
  return 0;
}
The linear(ref()) modifier is pretty new; one can hope that using linear with OMP and contiguous data can bring some joy.
The message:
"The UVAL or REF modifiers must not be used on a dummy argument with the VALUE attribute."
indicates that the argument is being passed by value, on the stack. I thought C arrays were in the heap? Maybe you need the -heap switch?
C-side:
I suggest you add linear to the #pragma:
#pragma omp declare simd linear(r,s)
Have you looked at using #pragma vector aligned? And #pragma ivdep? This is assuming (*r) is a reference (in the heap).
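As a rough sketch of the linear suggestion for the C side: on a pointer parameter, the OpenMP linear clause on declare simd says that the pointer advances by one element from lane to lane, so the lanes address consecutive doubles instead of arbitrary locations. The driver add_in_place is a hypothetical C stand-in for the Fortran loop, and whether a given compiler then emits unit-stride loads rather than gathers is something to check in its vectorisation report.

```c
#include <stddef.h>

/* linear(r, s): across SIMD lanes each pointer steps by one double,
   i.e. lane k sees r + k and s + k. */
#pragma omp declare simd linear(r, s)
int simd_c_func(double *r, double *s)
{
    (*r) += (*s);
    return 0;
}

/* Hypothetical C driver standing in for the Fortran caller.
   The return code is deliberately ignored in this sketch. */
void add_in_place(double *r, double *s, size_t n)
{
    #pragma omp simd
    for (size_t i = 0; i < n; ++i)
        (void)simd_c_func(&r[i], &s[i]);
}
```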
F90-side:
I suggest you replace:
real(8), allocatable :: r(:), s(:)
with
!dir$ attributes align:64 :: r, s
real(kind=8), dimension(:), allocatable, contiguous :: r, s
The real change here is the 64-byte alignment and contiguous. One can also use the switch -align array64byte
If ierror is not an array, do you want one? Or perhaps you need to use a firstprivate/lastprivate or reduction clause on ierror? I would use reduction.
!$omp simd
goes to:
!$omp simd reduction(ierror)
But you mention that you need more than one value, so maybe you need to allocate ierror?
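To make the reduction idea concrete, here is a sketch in C. An OpenMP reduction clause needs an operator as well as the variable; max is used here on the assumption that a nonzero code means failure and the largest code is the one to keep. The kernel checked_add, its "negative s is an error" rule, and the driver add_all are all hypothetical and only serve to give the status code something to report.

```c
#include <stddef.h>

/* Hypothetical status-returning kernel: a negative *s is treated as
   an error, in which case *r is left unchanged and 1 is returned. */
#pragma omp declare simd linear(r, s)
int checked_add(double *r, const double *s)
{
    if (*s < 0.0)
        return 1;
    (*r) += (*s);
    return 0;
}

/* Hypothetical driver: combines the per-element status codes with a
   max reduction, so any failing element makes the result nonzero. */
int add_all(double *r, const double *s, size_t n)
{
    int ierror = 0;
    #pragma omp simd reduction(max : ierror)
    for (size_t i = 0; i < n; ++i) {
        int e = checked_add(&r[i], &s[i]);
        ierror = (e > ierror) ? e : ierror;
    }
    return ierror;
}
```

A single reduced scalar only says that something failed somewhere. If the caller needs one status per element, an array of codes indexed by i would be the alternative.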
And I am interested in how the interface knows it gets a return value of zero. It would seem you have to have integer function somewhere in the interface.
Linking-compiling side:
If the C produced an F90 module, then the use would allow the compiler to see deep enough into the C code to inline or otherwise help. You may need -ipo on the compile/link to allow the F90 to understand what it can do with the C callee.