Convolve two arrays in c++

B

Thread Starter

Bjoles

Does any one know if c++ allows you to convolve (add) two arrays (two dimensional) without having to access the arrays element by element? I'm trying to do some image processing and I am getting killed by the processtime using for loops.
Thanks,
Bryan
 
G

Gary Choplin

I hope this helps. Arrays can only be manipulated element by element.

How big are the arrays and what type of loop structure are you using? nested loops or one big loop?
 
If this were a newsgroup, I'd ask you to post your current loops to see if there were any glaring optimizations missed.

There are multiple improvements you could make to the algorithm for adding two arrays. If you are using array objects, you may be forced to go
through element by element. But if it's a plain old 2 dimensional array (as in C) you could:

a) create a one-dimensional array over the same elements and iterate with a single index.
b) use pointer variables rather than referencing elements of the array with subscripts.
c) if you know the adds are not going to overflow their element type, you might overlay an array with a larger element type, so adds can handle multiple bits in parallel. e.g. overlay a byte array of 100 elements (8 bits each) with a
long int array of 25 elements (32 bits each).
d) Drop down into assembler for the array add.
e) I don't know for a fact, but I belive that Pentium class processors have instructions designed for multi element adds and subtracts, specifically for image processing (MMX and stuff...)
f) process the elements that are actually being displayed first, display the result, and finish processing the rest in the background.

Is this good for you, or do you need code samples? Or rationale for the suggestions?

Rufus
 
V

Vladimir E. Zyubin

To optimize processtime write the instructions in C... Or in Assembler. Or try to use the specific structures of the arrays (if there are any)...
for example, convolve the parts of the array you need only.. (if it can be done)... and there is the way to parallelize the culculation (to use
several processors)

I see you use a C++ library, it is not the best way to program in the situation.

--
Best regards,
Vladimir mailto:[email protected]

== there is no magic in programming at all.. and in OOP in particular ==
 
J
>Does any one know if c++ allows you to convolve (add) two arrays (two
>dimensional) without having to access the arrays element by element?

This is not a language issue. You need to look into how the processor works and write an assembly language routine that best uses the
machine's intruction set to accomplish this. There is probably an optimized library routine already available for this.


Jay Kirsch
[email protected] <mailto:[email protected]>
 
C

Chuck Karwoski

Here's a simple way to reduce the interations for summing the contents of a 100 element array

// 4 times faster than adding items one at time
int sum = 0;
for (int i=0; i<100 ; i+=4)
sum += (array1 + array1[i+1] + array1[i+2] + array1[i+3] );


You can tweak this to add as many items at a time as needed
for (int i=0; i<100 ; i+=10)
sum += (array1 + array1[i+1] + array1[i+2] + array1[i+3]...
array[i+9] );
 
Chuck,

I think you missed something in the original post. Bryan was looking for a way to add two arrays, not sum the elements of an array.

The technique you show is unrolling the loop, basically doing multiple operations per iteration to reduce the number of iterations. But the speed
improvement is FAR less than the multiple. You are still doing the same number of array references and array element adds. In fact, a poorly optimising compiler might even have some additional additions (the i+1, i+2, etc.) array references. The only time you are saving is the time spent in the actual iteration mechanism.

This approach can be extremely effective in PLC's, where each loop is a separate "scan". But that wasn't the question.

Rufus
 
A

Alex Pavloff

Bryan:

Well, firstly, the advice people are giving you about unrolling the loops and using pointer variables is probably misleading -- compilers are pretty damn smart about optimizing loops like that nowadays. (This of course, assumes that you have a semi-modern compiler). However, what seems strange is that you say you're getting "killed" by the process time in loops.
Unless you've got a slower platform or large array sizes, this shouldn't really be a problem. If you have a profiling tool, I would recommend using that to verify that what you think is the slow part ACTUALLY is the slow part. (And also, "slow" is relative <g>)

If this is just an array of simple elements, about the only thing that you can use to improve the speed is some processor dependent stuff. For
example, take a look at "http://www.mindshare.com/Mindshare/mmx.htm":http://www.mindshare.com/Mindshare/mmx.htm and see if you can use MMX. However, you ARE going to have to write some inline assembly in your code.

You may also be able to find a C++ library that will do some of these operations for you. Take a look at the various libraries at
"http://www.trumphurst.com/cpplibs/datapage.phtml":http://www.trumphurst.com/cpplibs/datapage.phtml and see if one of those can help you at.

Good luck!

Alex Pavloff
Software Engineer (C++ programmer)
Eason Technology
 
B

Brian Martinicky

In addition to Mr. Pavloff's excellent advice, I might also add that if you are using C++, or more specifically, an instance of some C++ array class,
you should get familiar with how it is implemented. For example, in MFC's CArray class, the subscript operator [] is overloaded to call Get/Set methods that actually get/set element values. What this means is that a simple line like this:

ArraySum = ArrayOne + ArrayTwo;

*really* works out to be:

ArraySum.SetAt( i, ArrayOne.GetAt( i ) + ArrayTwo.GetAt( i ) );

Not what you expected?

As you might imagine, there will be a significant difference in execution time between the simple indexing & addition found in the first example and
the three method calls & math of the second.

If you are not already, either use raw C-style arrays directly, or a C++ class that is optimized for operations like yours...

Regards,
Brian Martinicky
Automation Intelligence, Inc.
 
M

Michael Griffin

At 12:38 03/06/01 -0400, Bjoles wrote:
<clip>
> Does any one know if c++ allows you to convolve (add) two arrays (two
> dimensional) without having to access the arrays element by element? I'm
> trying to do some image processing and I am getting killed by the
> processtime using for loops.
<clip>

If the microprocessor you are using has a memory management unit with a cache (you haven't said what you are using), you may want to dig out
your compiler documentation and see how it constructs two dimensional arrays. You would need to find out which index addresses consecutive memory addresses. If your entire array will not fit in the cache and you pick the wrong index for your inner loop, you may have slower performance. Large data arrays can have a significant effect on a cache, so you need to know how the compiler will interact with the particular hardware you are using.

For example, if you have array A[x,y], then if index 'x' addresses consecutive memory locations, while 'y' addresses elements which are 'x' locations apart, then the inner loop should index 'x', and the outer loop should index 'y'. This way when a block of memory is fetched from the main RAM and placed into the cache, the inner loop will work on data which is
already in the cache.
If however, index 'y' accesses consecutive memory locations, then using 'x' as the inner loop has the potential of having every memory access cause a cache fault and fetch data from main memory. This is actually slower than if you had no cache at all.

I'm not sure that the above is what your problem is, as you are not very specific about how slow "slow" is. Before you worry about this level of detail, I would suggest that you follow the recommendations of various other people and make sure that your C++ compiler isn't inserting a lot of slow code into your program that you don't know about.


**********************
Michael Griffin
London, Ont. Canada
[email protected]
**********************
 
A

Alex Pavloff

> For example, in MFC's CArray class, the subscript operator [] is
overloaded to call Get/Set
> methods that actually get/set element values. What this means is that a
simple line like this:
>
> ArraySum = ArrayOne + ArrayTwo;
>
> *really* works out to be:
>
> ArraySum.SetAt( i, ArrayOne.GetAt( i ) + ArrayTwo.GetAt( i ) );
>
> Not what you expected?
>
> As you might imagine, there will be a significant difference
> in execution time between the simple indexing & addition found in the
> first example and the three method calls & math of the second.

Actually, I'll argue with that. You're right in that the array operators end up calling GetAt and SetAt. However operator[] (non-const version), actually calls ElementAt, which, in release mode, just indexes right into the C array and returns a reference to the element. This is likely to be optimized out into the exact same call that you would have made in C. No speed loss there. The same pattern is followed with every single one of the CArray classes (and all the STL classes too, which are, in my opinion, much nicer to use).

C++ isn't intrinsically slower than C... what I find is a killer though when using container classes like this is that the copy constructor is invoked at various times, and the cost of copying an object ends up making the entire system slow to a crawl. After running a profile once, I changed two lines of code at one point and reduced a 30-minute operation to 20 seconds because I was doing too much work. I never even suspected the culprit.

Nothing beats a profile. Make it work, make it right, and THEN make it fast.
 
B

Brian Martinicky

You are right. I did not mean for my post to serve as a comprehensive tutorial on the MFC CArray class, but rather to illustrate that when using objects with overloaded operators, appearances can be deceiving. Looking back, I see that I could have done a better job...

Regards,
Brian
 
S
hi,
u can't convolve 2 arrays without tracing array one by one item because u never get the addresses of the element without traversing,
it is not possible and if u able to do this work please also give me tips about that thing.

thanks
deepak
 
hi,
i want to say u something about the above topic
u have no other choice to add to array with tracing the element by element.

there is no other choice.
deepak
 
No, but you can probably speed it up... (Anyway, Matlab-style "A+B" does the same thing, only it doesn't tell you.)

First, turn on optimization in your compiler, or try various combinations of optimization options.


If you're adding corresponding elements, you can treat the arrays as one-dimensional (cast if necessary).

You can partially unroll the loops (ie, do four or eight in each iteration), but be careful to make sure the whole loop fits in the cache. For a famous unrolling hack, look up "Duff's device" on the Internet.


When you do these things, make sure you actually measure which one's fastest, because it's not always obvious.


Assembly language is probably not a good idea - today's CPUs are superscalar and you'd have to know far too well how the pipeline works.

Jiri
--
Jiri Baum <[email protected]> http://www.csse.monash.edu.au/~jirib
MAT LinuxPLC project --- http://mat.sf.net --- Machine Automation Tools
 
C
Use the Source, Luke!

I'll bet if you were to poke around on the Scientific Applications on Linux site, you might find what you are looking for. Other freeware sites might be helpful but the Linux sites typically have more stuff in source code form.


"http://sal.kachinatech.com/":http://sal.kachinatech.com/

A simple Google Search for "convolving matrices" also got quite a few hits to shuffle through.

Regards

cww
 
J

Johan Bengtsson

Note that completely regardless of how you do it all elements have to be accesed one by one, so also in matlab, by the processor so it is bound to take some cosiderable processor time anyway.

Some tips:
- moving pointer is faster than array indexing
- assembler can be used to speed up things. Really clever assebly code could probably improve things with a factor 2 to 10 in such cases
- datatype, are you using floting point (slow) or integer math (fast)
- if you use say 8 bit or 16 bit integer math and have access to a MMX processor there are instructions availiable taking several numbers at a time (4, or possibly 8). But to use these effectively you probably have to write it in assembler anyway.

The best you can do is if the bandwidt to and from memory is the limiting factor....


/Johan Bengtsson

Do you need education in the area of automation?
----------------------------------------
P&L, Innovation in training
Box 252, S-281 23 H{ssleholm SWEDEN
Tel: +46 451 49 460, Fax: +46 451 89 833
E-mail: [email protected]
Internet: http://www.pol.se/
----------------------------------------
 
A

Anthony Kerstens

Or try using pointers to do the math. That would work directly
on the memory locations and save some time, but it would
overwrite one of the arrays.

Anthony Kerstens P.Eng.
 
D

David Bergeron

I don't know any other way to do this in C++, but Matlab will allow you to do this by just adding them such as A+B where A and B are equal dimensioned matrices. Matlab was built around the use of Matrices and does Matrix manipulation very efficiently.

Another solution to improve processor time could be to write a subroutine in assembly to do it. Boy, doesn't that sound like fun.

David Bergeron, P.E.
Thompson Equipment Co.
125 Industrial Ave
New Orleans, LA 70121
http://www.thompson-equipment.com
Phone: 504-833-6381
========================================
 
C

Carsten Weinhold

guess, your problem here is that arrays are not directly supported by c++, and 2-dim
arrays are mimicked by using a "vector-of-vectors" representation. sure, doing the calculation directly on the allocated memory (using pointers) speeds things up -- depending on the "internal" matrix representation (if it's a linked list, forget about that). there are some free c-libraries for matrix operations and image manipula- tion available, and this would be the faster way if you are not an experienced c++ pro- grammer (look for linux / GNU scientific software). if you need speed, you should go for a special hardware anyway. finally -- are you really sure that you want to do this element-wise addition of matrizes? depending on what you want to achieve, there might be other algorithmic ways (look into the vast image processing literature).
 
Top