[Progress Communities] [Progress OpenEdge ABL] Forum Post: RE: Loading schema into a new, empty database is slow (prodict/load_df.p)

dbeavon (Guest)

The tips about -r and -i were helpful. It appears that when I add those to my single-user load program (along with -1), the DF load time consistently drops from ~60 seconds down to about ~53 seconds. That's seven fewer seconds of twiddling my thumbs! (I've put a sketch of the resulting command line at the bottom of this post.)

As far as my SimpleStructure.st goes, it is a simplification of what we have in production. It is just enough to load the actual DFs from production into an empty database:

#
b BuildSchema.b1
#
d "Schema Area":6 BuildSchema.d1
#
d "abc1_log":7 BuildSchema_7.d1
#
d "abc1_log_idx":8 BuildSchema_8.d1
#
d "abc2_gl":9 BuildSchema_9.d1
#
d "abc2_gl_idx":10 BuildSchema_10.d1
#
d "abc2_je":11 BuildSchema_11.d1
#
d "abc2_je_idx":12 BuildSchema_12.d1
#
d "abc2_jehdr":13 BuildSchema_13.d1
#
d "abc2_jehdr_idx":14 BuildSchema_14.d1
...

These are Type I areas by default, I believe. The actual structure from production looks more like this:

#
d "Schema Area":6,64;1 /dbdev/develop/lum/lum.d1 f 1000000
d "Schema Area":6,64;1 /dbdev/develop/lum/lum.d2
#
d "abc2_gl":9,64;64 /dbdev2/develop/lum/lum_9.d1 f 1000000
d "abc2_gl":9,64;64 /dbdev2/develop/lum/lum_9.d2 f 500
...

I don't know what the database block size would be. It would be the Windows default, I suppose; I think that is 4 KB, not 8 KB like on HP-UX.

>> the process is CPU-bound and there's a limit to what my single-hamster-driven machine will do

But what if you could get four or eight hamsters running at once? That is what I'm really going for. I'm wondering if I can arbitrarily chop up the DF and create a number of databases to compile against, rather than just one. Or maybe, before compiling, there could be a way to merge the areas created in multiple databases, so that they are combined into a single database?

The real problem is that load_df.p is written in single-threaded ABL. Maybe Progress should rewrite dump_df/load_df in a multi-threaded way (or divide up the work and kick off some baby-batch.p sessions to load it concurrently). That way more of my CPU cores would actually get put to use!
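For reference, here is roughly what the single-user batch load looks like with those flags. This is a sketch, not my exact script: "BuildSchema" and the file names are placeholders, and I believe passing the .df name to prodict/load_df.p via -param is the documented batch-mode approach.

rem Single-user batch load of a .df into an empty, disposable build database.
rem -1 = single-user, -b = batch mode, -r = buffered I/O, -i = no crash protection.
rem -r and -i are only safe here because the database can be recreated from the .st.
_progres.exe -b -db BuildSchema -1 -r -i -p prodict/load_df.p -param BuildSchema.df > load_df.log 2>&1

And if the extra hamsters ever materialize, I picture something like the following. Purely hypothetical: Build1/Build2 and chunk1.df/chunk2.df are made-up names, nothing here actually splits the DF, and any real split would have to keep each table together with its indexes.

rem Kick off several loads concurrently, one batch session per database.
rem Windows "start" returns immediately, so the sessions run in parallel.
start "load1" _progres.exe -b -db Build1 -1 -r -i -p prodict/load_df.p -param chunk1.df
start "load2" _progres.exe -b -db Build2 -1 -r -i -p prodict/load_df.p -param chunk2.df

The catch is still the step afterwards: the compiler wants one logical schema, so you would either connect all the little databases at compile time or find a way to merge them back into one.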
