Question: Stream vs TT

Cringer

ProgressTalk.com Moderator
Staff member
I've been asked to improve the efficiency of a data dump as it currently takes forever to run. There's some inherent nastiness that I will be removing, but a thought occurred to me. At the moment it opens a load of streams, and outputs the data line by line. Would it be more efficient to put the data required into temp tables and then to export to the files at the last moment in a nice tight loop?
 

TheMadDBA

Active Member
Short answer... Probably not going to be faster, especially since creating the TTs is going to take additional disk IO/memory.

Longer answer...

On most modern versions of UNIX, and even Windows, there are caching mechanisms that avoid physical writes to disk until a certain chunk size is ready. Unless you specify UNBUFFERED, every EXPORT/PUT/etc. is probably not actually writing to disk. In my experience, most of the time in dumps is spent on (in this order):

1) Reading records you don't need to (bad WHERE clauses and/or incomplete indexes)
2) Order of the data on disk (if the data is spread out, it takes more physical IO to read the records)
3) How fast the disks are able to read the data
4) How fast the disks are able to write the data

You can get some performance gains in certain cases by using OUTPUT THROUGH (compress/zip/pack/etc). This will require some percentage of extra CPU to accomplish but it can help if CPU isn't your bottleneck and the files are large enough.
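A minimal sketch of what that could look like, assuming a UNIX host with gzip on the PATH and a Customer table (path and field names are illustrative, not from the original post):

```abl
/* Sketch only: pipe EXPORT output through gzip so less data is
   physically written. Assumes UNIX with gzip available; adjust
   the path and table to your schema. */
OUTPUT THROUGH VALUE("gzip > /tmp/customer.d.gz").

FOR EACH Customer NO-LOCK:
    EXPORT DELIMITER "," Customer.
END.

OUTPUT CLOSE.
```

The trade-off is exactly as described above: gzip burns CPU, but the stream hitting the disk is much smaller.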
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
I've been asked to improve the efficiency of a data dump as it currently takes forever to run. There's some inherent nastiness that I will be removing, but a thought occurred to me. At the moment it opens a load of streams, and outputs the data line by line. Would it be more efficient to put the data required into temp tables and then to export to the files at the last moment in a nice tight loop?

How do you define "data dump"? Is it all fields of all records? Some fields of some records that satisfy a query? Something else?

What is the "nastiness"?
 

RealHeavyDude

Well-Known Member
Nowadays I often use a longchar in which I collect the output, and then use the COPY-LOB statement to write it to a file in one step instead of dealing with streams. But I have to admit that I neither chose this for performance reasons, nor do I have any evidence that it is faster.
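For anyone unfamiliar with the pattern, a rough sketch (table, field, and file names are made up for illustration):

```abl
/* Sketch of the longchar + COPY-LOB pattern described above:
   accumulate everything in memory, then write once at the end. */
DEFINE VARIABLE lcOut AS LONGCHAR NO-UNDO.

FOR EACH Customer NO-LOCK:
    lcOut = lcOut + Customer.Name + ","
                  + STRING(Customer.CustNum) + "~n".
END.

/* One write at the end instead of line-by-line PUTs. */
COPY-LOB FROM lcOut TO FILE "/tmp/customer.csv".
```

Note that repeatedly concatenating onto a LONGCHAR has its own cost, which may matter for very large outputs.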

Heavy Regards, RealHeavyDude.
 

Cringer

ProgressTalk.com Moderator
Staff member
Thanks for all the input. Much appreciated.

@Rob the whole thing is nasty. The data is dumped piecemeal, satisfying arbitrary criteria. The issue is that it's dumped out for use in an Access system (I know!), and they didn't want to spend money on the enhanced license costs, so each file has a size limit. Add to that, a load of the data is calculated. In other words, a standard dump won't help.
The nastiness will hopefully be removed, but it's stuff like FOR EACH Customer, RUN a program containing FOR EACH Site... :(
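A hedged sketch of the kind of restructuring that implies, assuming Site carries a CustNum field linking it to Customer (all names illustrative):

```abl
/* Instead of FOR EACH Customer: RUN some-program.p with its own
   FOR EACH Site, a single joined query avoids the per-customer
   procedure-call overhead. Field names are assumptions. */
FOR EACH Customer NO-LOCK,
    EACH Site NO-LOCK WHERE Site.CustNum = Customer.CustNum:
    EXPORT DELIMITER "," Customer.CustNum Site.SiteId.
END.
```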
 

GregTomkins

Active Member
"I often use a longchar where I collect the output and then use the copy-lob statement to write it to a file in one step instead of dealing with streams."

Hey, that's a neat idea! I guess you are concatenating delimiters yourself? You probably have some generalized way of doing that.

I wonder if the overhead of doing that in ABL is worth it - my vague observation is that manipulating individual bytes is something ABL is not very good at, performance-wise. By contrast, I am *guessing* the standard PUT/EXPORT etc. is optimized C and runs like the wind. Just guessing, of course.
 

RealHeavyDude

Well-Known Member
That is correct - I am inserting delimiters myself. Examples of the usage would be OS scripts that I create on the fly and then execute via OS-COMMAND, or replacing placeholders in .ini or .pf files, which I load with COPY-LOB, run a REPLACE on, and write back out with COPY-LOB.
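The load-replace-write round trip looks roughly like this (template file name and placeholder are made up):

```abl
/* Sketch of the pattern described above: read a template .pf into
   a longchar, swap a placeholder, write the result back out. */
DEFINE VARIABLE lcFile AS LONGCHAR NO-UNDO.

COPY-LOB FROM FILE "template.pf" TO lcFile.
lcFile = REPLACE(lcFile, "<dbname>", "sports2020").
COPY-LOB FROM lcFile TO FILE "run.pf".
```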

Since I do not deal with really "large" files that way, I have never had any performance issues so far.

Heavy Regards, RealHeavyDude.
 