
Does Photomatix make use of 2 cores?

Posted: Thu Mar 01, 2007 12:14 pm
by Paul Skoczylas
Does the Photomatix HDR software make use of multiple core processors? I know I'll find the answer to this by experience in a couple of days, but I'm too curious to wait. (My new computer is being assembled painfully slowly, since I have very limited free time. But I'm to the point where I can start reloading software now. :D )

Also, I wonder if it can make use of extra memory. I have 4 GB in there--I need to read EJ's instructions on setting the 3GB switch--and I wonder if it can make use of it. For that matter, can Panorama Factory make use of it? How about PSE2? (I know CS2 can, but that doesn't mean PSE2 can... I'll be upgrading to CS3 when it comes out, but not until then.)

Thanks,

-Paul

Posted: Thu Mar 01, 2007 12:46 pm
by E.J. Peiker
None of Photomatix's documentation addresses this - guess you'll just have to test it :)

Posted: Thu Mar 01, 2007 6:03 pm
by Royce Howland
Photomatix currently does not use more than 1 core in either the merge to HDR function or the tone mapping functions. Also it has some practical memory use limits, in terms of the final HDR file size, that are well below the maximum memory available with the /3GB boot switch. Exactly where those limits are is a bit fuzzy; there are two separate large file functions that can be employed to help with big jobs and they both have different limits.

One is accessed via the Automate > Batch Processing menu, which can generate a .hdr file from several stitched pano's using a "strip-by-strip" option to increase the effective size of the HDR file that can be worked on. I haven't yet hit a practical size limit with this option, but selecting strip-by-strip does disable the tone mapping batch functions. Therefore you can only generate the .hdr file; you can't process it in any other way via strip-by-strip.

Once you have a large .hdr file, you can then use HDR > Large File Processing menu item to tone map it using a command-line version of the tone mapper which cuts memory overhead by stripping down the memory consumed by the GUI. This mode doesn't have a strip-by-strip type of processing, so it does have to tone map the entire file at once. While this operation is less memory intensive than actually doing the initial merge to HDR, it will hit a limit. The Photomatix folks feel a file of about 100MP can be tone mapped via Large File Processing on a Windows computer with 2GB of RAM. I don't know what the limit is with 4GB of RAM, and whether setting the /3GB switch will change anything. Personally I doubt it since in my experience very few Windows applications support more than 2GB of virtual memory space, and that is most likely the limiting factor here.
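
As a back-of-envelope illustration of why the per-process address space, rather than physical RAM, is the likely ceiling (assuming the HDR is held as 32-bit float RGB in memory, about 12 bytes per pixel -- an assumption on my part, since Photomatix doesn't document its internal format):

    # Rough estimate of the in-memory footprint of an N-megapixel HDR image,
    # ASSUMING 32-bit float RGB (12 bytes/pixel); Photomatix's real internal
    # format isn't documented, so treat this as illustrative only.
    def hdr_buffer_mb(megapixels, bytes_per_pixel=12):
        return megapixels * 1e6 * bytes_per_pixel / (1024 * 1024)

    print(hdr_buffer_mb(100))  # roughly 1144 MB for one 100 MP buffer
    # Add working copies plus GUI overhead and a 2GB per-process address
    # space fills up fast, no matter how much physical RAM is installed.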

The other downside with Large File Processing is that it currently only supports the Details Enhancer tone mapping function. There is no way to tone map a large file using the Tone Compressor function. I've just been talking to the developer about this very thing. Large file support for Tone Compressor probably will not make it into the 2.4 release when it ships (available in beta now), but it is on the to-do list and perhaps will show up in the 2.5 release.

As for the other software, Panorama Factory can make use of multiple cores and is very fast in this configuration. PF also supports the extra memory available via the /3GB switch. However, its ability to process large stitches is much more limited than that of Autopano Pro, for example, which uses a cell-based rendering mechanism that side-steps memory limitations to process more or less arbitrarily large pano's. With Panorama Factory, I have hit memory-based stitching limits before, and my impression has been that it's limited to creating final files that are a little over 1GB in memory. This is supported by statements from the PF developer, John Strait; see: http://www.panoramafactory.com/discus/m ... 0/126.html

Note that PF does support 64-bit Windows, which effectively eliminates the Windows-based 2GB memory space limitation for applications designed for 64-bit operation. I've never run 64-bit Windows but PF on that configuration should be a screamer and handle massive files.

PSE2 can't make use of multiple cores. Nor can it make use of extra memory. I have 4GB and the /3GB switch enabled on my dual-core AMD machine. CS2 sees 2450MB of available RAM, while PSE3 sees only 1777MB available. This is essentially the difference between support (or not) for >2GB of virtual memory space.

Note that with most tools like these, processing 8-bit files instead of 16-bit files will substantially increase the resolution of the images that can be processed (usually about 2X), although quite possibly at the cost of reduced image quality. Since my workflow is to stitch first and process for HDR second, and HDR processing involves a lot of tonal manipulations that benefit from 16-bit source images, I maintain a full 16-bit workflow from start to finish in CS2. Support for very large 16-bit files is one of the reasons I'm evaluating Autopano Pro to augment Panorama Factory for stitching work...

Posted: Thu Mar 01, 2007 6:40 pm
by Paul Skoczylas
Thanks, Royce! I figured you'd have the answer. :wink:

I'm a little disappointed that Photomatix can't make use of two cores. Speed of that program is the main reason I decided to upgrade. Still, I know that the Core2Duo will still be MUCH faster than the P4 I had before, even with only one core running the program.

Have you talked to the developers about a multiple core upgrade for the future? Mathematically, I know that some operations are well suited to parallelization, while others are not. What I don't know is if the HDR calcs are suited to it or not--if not, then it won't happen, but if they are, it sure would be nice to have!

I haven't done huge HDR files like you (I think I've only ever done a multiple frame HDR once, and that was only two frames), so the memory issues are much less important to me than speed for the HDR work.

I'm glad PF can make use of the dual core and extra memory, though! It could be a bit slow at times--where I would just go away and leave it for a while. (Conversely, Photomatix requires too much operator intervention to leave it.) I'll have to try to run the big 40 frame stitch that I've never been able to complete before, when I get the machine up and running.

PSE2 is not as big a deal to me, as compared to the others. My P4 with 2GB never really seemed to bog down for more than a second or two on anything I ever asked it to do in PSE2.

-Paul

Posted: Thu Mar 01, 2007 7:10 pm
by Royce Howland
Paul, I haven't brought up multi-core yet with the Photomatix folks, although I'd be pretty surprised if it isn't on the list already. :) Some operations (such as the strip-by-strip batch HDR merge function) would be well suited to a cell-based approach (the strips are really just giant cells). This could be useful both for multi-core processing and for large-file memory optimization, which is the current motive for having it. Global tone mapping operators like the Tone Compressor function should also be straightforward to implement this way.

With other operations, particularly local operator, frequency domain and gradient domain type tone mapping functions (like the Details Enhancer), cell-based processing might be tougher to develop but possibly still doable; I don't know enough about the internals of the image transforms to say for sure. It is possible with some of these functions that breaking the work up into cells might introduce so much overhead in dividing the work, processing it and combining the final results, that it would defeat the performance benefit from multiple cores. (See http://luminance.londonmet.ac.uk/webhdr ... ping.shtml for a quick overview of these terms.)
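
To make the cell idea concrete, here's a toy sketch (purely illustrative -- it has nothing to do with how Photomatix is actually implemented) of running a simple global operator over row strips in parallel. A local operator would need overlapping cells and careful blending at the seams, which is where the overhead I mentioned comes in:

    # Toy sketch of strip/cell-based parallelism for a GLOBAL tone mapping
    # operator. Each strip is independent, so it parallelizes trivially.
    # Purely illustrative; not how Photomatix is implemented.
    import numpy as np
    from multiprocessing import Pool

    def tone_map_strip(strip):
        # Simple global operator (Reinhard-style L / (1 + L)).
        # Local operators would need overlapping strips and seam blending.
        return strip / (1.0 + strip)

    def tone_map_parallel(hdr, n_strips=4):
        strips = np.array_split(hdr, n_strips, axis=0)  # split by rows
        with Pool(n_strips) as pool:
            mapped = pool.map(tone_map_strip, strips)
        return np.vstack(mapped)

    if __name__ == "__main__":
        hdr = np.random.rand(2000, 3000, 3).astype(np.float32) * 50.0
        print(tone_map_parallel(hdr).shape)  # (2000, 3000, 3)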

To date there probably hasn't been enough demand for HDR processing of large files using the "off the shelf" software like Photomatix for multi-core and memory-optimal functions to be developed. People doing this kind of work most likely are running more specialized software.

Realistically, I don't think speed will be an issue for you even with Photomatix taking only a single core of your new machine. Your box should outperform mine (which is coming up on a year old now) by a decent margin, and I find mine to be very acceptable for my HDR work. Tone mapping big pano's takes only a few tens of seconds. Stitching burns more time by far, including the actual processing work as well as the effort to fine tune merges & blends where the automatic stitcher functions don't quite get it right.

Posted: Thu Mar 01, 2007 7:22 pm
by E.J. Peiker
At least with multi-core machines, the machine can still do other things well while the tone mapping is going on. This is not the case with single-core machines, where the machine is pretty much useless for even email during tone mapping. It can take a while with 5 1Ds2 files being tone mapped into a single photo, so it's nice to be able to do something else with the computer while it's off doing that.

Posted: Fri Mar 02, 2007 2:59 am
by Paul Skoczylas
Well, I gave it a quick try on the Core2Duo E6400 w/ 4GB (and the /3GB switch set)...

PF was amazingly fast. But unfortunately, that only meant that my 43 image stitch (consisting of 8 bit frames from a 20D) got to the "Out of Memory" error that much faster. (Within a couple of minutes, as opposed to half an hour on the older machine.) Both cores were clearly in use, with the total capacity running over 90%. I look forward to trying some more reasonable stitches on this machine!

Photomatix wasn't nearly as fast as I'd have liked. Yes, it's certainly faster than it was, but not amazingly so. (Still, it will make doing HDR less painful, for sure. I may not need to have a game of FreeCell going in the background like I used to!) The processor activity was interesting: both cores were being used, but not fully. This was visible in the graphs in the Task Manager. The total capacity ran between 50-60%, which does seem to indicate that it is doing better than a single core would--but only slightly better.

Oh, and this was the first time I'd tried the anti-ghost feature in the Photomatix Beta. Worked like a charm!

PSE2 ran fine of course, but I rarely do things in PS which test my patience, so it's not a real test of the machine like the other two were.

I'm curious: What memory should XP report to me as Total Physical Memory in the Task Manager when I have 4 GB? On this laptop with 1 GB of RAM, it reports 1,040,000 K--pretty much the full amount--but on that machine, it only reports 3,200,000 K (or something similar--I'm not in front of it right now). Is this a problem? Can XP make use of the full 4 GB?

-Paul

Posted: Fri Mar 02, 2007 9:14 am
by E.J. Peiker
Paul, please go back and read the long thread on the /3GB switch for a complete explanation but in short the answer is no. It can't use 4GB and the numbers you are seeing are pretty typical.

As for the out of memory error, one way I have gotten around this is to break up the stitch into several. So for your 43 image stitch, do it in, say, 5 passes of 9 images each (you will use two frames twice). Then import each one into PS and crop them so that each one has exactly the same total pixel size in x and y. Now rerun PF and stitch the 5 together. It may take more than one iteration.
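
If it helps to picture the bookkeeping, something like this (just a sketch with made-up filenames; the pass size is arbitrary, and the key point is that adjacent passes share at least one frame so the sub-panos can be registered to each other):

    # Sketch: split a long frame sequence into overlapping passes so each
    # sub-pano stays under the stitcher's memory limit. Pass size and
    # overlap are arbitrary here; adjacent passes must share frame(s).
    def split_into_passes(frames, pass_size=9, overlap=1):
        passes, start = [], 0
        while start < len(frames) - overlap:
            passes.append(frames[start:start + pass_size])
            start += pass_size - overlap
        return passes

    frames = ["IMG_%04d.tif" % i for i in range(1, 44)]  # 43 frames
    for i, p in enumerate(split_into_passes(frames), 1):
        print("pass %d: %d frames (%s .. %s)" % (i, len(p), p[0], p[-1]))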

I was able to stitch 21 1Ds2 shots together like this which is about the same as 43 8MP frames.

Posted: Fri Mar 02, 2007 11:27 am
by Royce Howland
Re: the Panorama Factory large stitch work-around described by E.J., that certainly will work. However it's an example of the type of extra effort that I'm personally getting tired of with large projects in PF. No direct support for large stitches, no direct support for multi-row stitches, no direct support for stitching multiple exposure sequences with the same control points, etc. There are hacks and work-arounds for all of these but I'm spending a lot of time hacking and working around. :)

This is why I'm reevaluating Autopano Pro now that their version 1.3 has been out & stabilized for a while. I still like PF for smaller projects because it is fast, simple and produces very clean output with minimal effort. But for bigger projects I need something that works better. Autopano supports very large projects (apparently arbitrarily large), directly stitches multi-row projects even if involving an oddball patchwork of input images, and can directly stitch multiple exposure sequences in one go because it was designed to natively handle HDR projects. So I'm hopeful it will do what I need for my large projects. Autopano is slower than PF by a considerable margin, but I'm spending lots of hands-on time with PF right now anyway. With Autopano hopefully I can just set it to run and then walk away and do something else useful with my time.

Re: Photomatix, the use of both cores at the 50-60% level is what E.J. was talking about in terms of being able to do some other things on the box while an HDR process is running. Photomatix will take essentially 100% of one core (50% of the machine's capacity) during heavy computations, while the other core will be available for Windows background processes and any other apps you may be running.

How fast or slow was "not nearly as fast as I'd have liked"? There may be some things that can help. For example I always disable the Undo feature in Photomatix, which consumes less memory (thus increasing the size of files that can be processed) and may speed up tone mapping a bit due to not having to sling large undo memory regions around. If it's clear that I'm going to crop an image, I also do this early (i.e. during RAW conversion or stitching) to provide smaller files for Photomatix to chew on. Also some of the alternative modes of operation may run faster because they are command line processes that are disconnected from a Windows GUI. For example, try using Automate > Batch Processing to generate your HDR files and see if that's faster.

As a point of reference, on my machine I just used Automate > Batch Processing to generate a .hdr file from three 16-bit TIFF files, each 14439x3938 pixels in size and with a file size of 325MB. The resulting .hdr file is also 14439x3938 and is 167MB on disk. I then used HDR > Large File Processing to run Details Enhancer on the .hdr file and produce a 16-bit TIFF image, again 14439x3938 and 325MB on disk. Timings:
- Merge 3 TIFFs -> .hdr: 75s
- Tone map .hdr: 205s
A total processing time of 280s may seem lengthy, but considering that it's a ~57MP 16-bit image being processed, it ain't bad. A single-frame 8MP image should be drastically faster, probably about 40s or less of net processing time. With large files like I timed, a healthy chunk of the time is spent in disk I/O operations; multi core support wouldn't much help the I/O bound part of the process.
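
For anyone checking the arithmetic (assuming uncompressed 3-channel 16-bit data; a real TIFF may be compressed and come in a bit smaller):

    # Sanity check on the figures above, assuming uncompressed 16-bit RGB
    # at 6 bytes/pixel; actual TIFFs may be compressed.
    w, h = 14439, 3938
    print(round(w * h / 1e6, 1), "MP")           # ~56.9 MP
    print(round(w * h * 6 / 1024.0 ** 2), "MB")  # ~325 MB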

Posted: Fri Mar 02, 2007 12:00 pm
by Paul Skoczylas
Royce,

How do you do batch processing? I like to tweak the settings to optimize each picture, so I can't see how I would use the batch mode.

Regarding the Undo, I use that to switch between DE and TC mode--I do the DE, save, undo, and then do the TC. Do you think it would be faster to save the HDR, and reload it (with undo turned off)? I should try that. I've never actually saved the HDR file itself.

The vast majority of my HDR work is a single 30D frame, with 3 exposures (sometimes 5 exposures). Only once have I done a multi-frame HDR, and even that was only two frames (x 3 exposures). With that size of project, there's a bit of waiting at each step of the process, but not enough to go do anything else (besides make a few moves in FreeCell). It's just slow enough to be annoying. The new machine is certainly better, but no panacea. (It probably won't be worth having FreeCell open anymore, though.)

The 50-60% I mention is not just one core going all out. As I said above, both cores are definitely involved. From the graphs, I estimate that one is at 70-80% while the other is at 20-30%, for a total of 50-60%. With the task manager showing graphs for both cores, you can see the activity in both jump when you start an HDR job. Clearly there are a lot of independent calculations, and the computer is throwing some to each core. But since the program is not parallelized, it can't make full use of both.

-Paul

Posted: Fri Mar 02, 2007 12:49 pm
by Royce Howland
In the past I have used batch processing only for very large files that I can't merge to HDR in any other way. Typically this means employing strip-by-strip mode, which also disables the ability to tone map. So I'm not actually using batch for tone mapping, just for HDR merging.

Large image tone mapping can also be done using the HDR > Large File Processing menu item, which currently supports only Details Enhancer. However, unlike batch, it does pull up the preview screen where you can set tone mapping parameters with some visual feedback. Once you have things looking right, the GUI is thrown away and the tone mapping proceeds by a command-line process which I imagine is the same one that would be used by the batch tone mapping function when strip-by-strip mode is not active.

Batch tone mapping normally would be used for processing multiple image sequences shot in the same conditions. You'd work up the first sequence visually and sort out what tone mapping settings worked properly. Then you'd process the remaining sequences taken in the same shooting conditions using batch, which lets you walk away from the machine while it crunches. I don't do a lot of HDR shooting that would be workable with this kind of tone mapping approach, but I can see some situations where it may be helpful. E.g. studio shooting or interior architectural shooting where lighting is more controlled.

Turning off undo would be faster only if you normally saved your HDR file anyway, or were not routinely running multiple tone mappers on the same image sequence. (It would also be useful to free up memory if you needed to process larger images.) If you're not already saving it, the extra I/O steps to save & load the HDR file may be more time consuming than just using undo like you're doing now. I save my merged HDR files, treating them sort of like a "super RAW" file for revisiting tone mapping later much as a person might revisit a RAW conversion with a new or alternative RAW software package.

Of course new features like ghost removal in Photomatix 2.4 mean I may need to throw the HDR file away and reprocess from scratch. It's questionable whether it's worth keeping any file at all, other than the original RAW and final web & flattened print versions. I know E.J. would say it's not, in the vast majority of cases. :)

But you'd need to examine your total workflow. Saving the HDR files may be beneficial if you can use the batch function to merge numerous HDR sequences all up front, since there is really little UI interaction required for this part of the work. Let batch crunch on the merges while you do something else. Then come back and do tone mapping one at a time using the HDR files, as time permits. This is what I'm moving to, since I can shoot a lot of HDR sequences in any given outing.

The compute-bound parts of Photomatix processing appear to get tied to one core. The jumping around on the graph you're seeing is mostly with the initial setup parts of the processing which may get allocated to different cores. E.g. when merging to HDR initially, the input TIFF files must be loaded up and prepped. These are separate operations that may go to different cores, but only one at a time. Once the actual merge computation starts, from what I can tell it proceeds in one single continuous operation, and is bound to a single core for the duration. Same with tone mapping. These are the two key operations that would benefit from multi-core support, although as mentioned before the complex tone mapping operators may or may not be practical to parallelize.

Whenever the application process needs to call on a Windows system function to do something, the core running the app process may slump while the OS function fires up on the other core. So you may see some zig-zag on the two core graphs even though the application is effectively bound to one core for the duration of large compute runs. The bottom line is that, as you stated originally, topping out at ~60% on a dual core machine indicates there is little or no benefit for processes like this in having two cores over one...

Posted: Fri Mar 02, 2007 1:27 pm
by Paul Skoczylas
Great idea on the batch merge and then manual tone-mapping. I've still got a large number of images to process from Dec. 19--batching the first part of the process might save me a lot of time! Or at least I could go away and do something else for that part. (And then I'd have the .hdr files--I'd have to see if turning the undo off and loading for each DE/TC is faster than loading once and undoing in between.)

I may have another problem for you to look at, Royce. Unless the anti-ghost feature fixes it. One of the series I have is a five exposure set including water flowing in a canyon with both bright sunlight and deep shade. The non-beta version of Photomatix gave some really funky artifacts in the water--I figure it was because of reflective highlights not being in the same place in each shot. If the anti-ghosting doesn't solve it, I'll post an example for your suggestions... (I didn't save the result from the non-beta version, because the artifacts wrecked it.)

-Paul

Posted: Fri Mar 02, 2007 3:01 pm
by Royce Howland
Let me know on the water thing. I've encountered stuff like that before and sometimes just couldn't find an effective way to handle it within Photomatix. The new ghost and water ripple functions may help... then again maybe not. :) I've not used the new functions much yet because I've been spending all of my time beating my head on Lightroom flakiness and large HDR stitches that aren't behaving nicely...