ABL2DB

TomBascom

Curmudgeon
10.2B -- I'm not finding a limit on the 1st argument to ENTRY() -- I am up to 90,000 at the moment. 490 seconds to get here.
 

Stefan

Well-Known Member
The factor of 1,000 was compared against the 'optimised' entry version with the line count outside the loop.
 

Stefan

Well-Known Member
I would hate to see the actual code that generates a 10MB listing much less 1GB :)

Nothing personal Stefan :(

Ha ha. We went completely dynamic and had a lovely, small code base. My secondary goal at the time was to fit our installer for two Progress versions and three databases on my 64 MB USB stick. That goal was achieved.

But...

Performance sucked, and the dynamic code was doing more than we actually had a clue about. So we switched to letting the dynamic business logic generate a static version of itself, specific to each entity. Performance went up, a lot of fluff was found, and the generated code was easier to debug. The generator, on the other hand, is currently at 20k lines and you need to be in a quiet zone when working on it. ;-)
 

Stefan

Well-Known Member
10.2B -- I'm not finding a limit on the 1st argument to ENTRY() -- I am up to 90,000 at the moment. 490 seconds to get here.
On 11.4 x64 my file with 174k lines was entried without issue (if you don't consider how long it took an issue ;-))
 

TomBascom

Curmudgeon
On my system 180,000 entries = 1976 seconds... 200,000 is 2442 seconds.

If I have done my math properly it isn't 1,000x slower. Because ENTRY() re-scans the string from the beginning on every call, walking the whole file costs N * ( N + 1 ) / 2 entry scans instead of N, which might as well be N^2 / 2. To compare the two approaches you divide by N, so the difference becomes N / 2. So with N = 170,000 ENTRY() should be roughly 85,000 times slower than INDEX() et al. I think.
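
As a rough check against the timings above: (200,000 / 180,000)^2 is about 1.23 and 2442 / 1976 is about 1.24, so the measured growth is consistent with the N^2 / 2 model; and with N = 170,000 that works out to roughly 1.4e10 entry scans versus 170,000 for a single pass.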

For files that a person might actually read it makes no practical difference. For big stuff it will matter.
 

tamhas

ProgressTalk.com Sponsor
I think we have multiple things being confused here. The *original* code had num-entries executed multiple times, e.g., for the do loop limit. Not a problem with small files, but a problem with big ones, so that got moved outside the loop and set once.

But the other part of the problem here is walking through the file line by line. I.e., the entry() side of this is:
Code:
/* num-entries() is computed once, outside the loop */
inMax = num-entries( lcFile, "~n" ).

do inWhich = 1 to inMax:
  chLine = entry( inWhich, lcFile, "~n" ).
  inCount = inCount + 1.
end.

The real code obviously does something more interesting than just extract the line and count. My problem with entry blowing up above was due to forgetting the "~n" delimiter on the entry inside the loop.
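
For comparison, a minimal sketch of what the index() side of the walk might look like (assuming lcFile is a LONGCHAR, lines are delimited by "~n", and the variable names are just illustrative):
Code:
define variable inPos   as integer   no-undo initial 1.
define variable inNext  as integer   no-undo.
define variable inLen   as integer   no-undo.
define variable chLine  as character no-undo.
define variable inCount as integer   no-undo.

inLen = length( lcFile ).

do while inPos <= inLen:
  inNext = index( lcFile, "~n", inPos ).
  if inNext = 0 then inNext = inLen + 1.  /* last line may lack a trailing ~n */
  assign
    chLine  = substring( lcFile, inPos, inNext - inPos )
    inCount = inCount + 1
    inPos   = inNext + 1.
end.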

The listing I am testing with now is just under 10MB. 1GB is frightening, but I think Tom is thinking of log files and the like.

My results:
entry(): 588 s
index(): 0.4 s
Wow!
 

TomBascom

Curmudgeon
Yes, I am testing with .lg files :) In real life I have better ways of working with those but they make a good large text file test bed.

Your test above is roughly 3,000 lines? (588 / .4 ) * 2

I'm still perfectly ok with ENTRY() for small stuff -- it is clean, elegant and easy to understand and explain.

But obviously anything big enough to take a second or so to process, especially if you are doing things with a bunch of such files, is going to need a more optimized approach.
 

Stefan

Well-Known Member
BuildSchemaDF is missing two tokens:

Code:
Begin List of BuildSchemaDF exceptions
BuildSchemaDF WARNING: Unexpected token 'FOREIGN-NAME' in TableAdd at line 7 processing DB exactcs
BuildSchemaDF WARNING: Unexpected token 'LABEL-SA' in ColumnAdd at line 17 processing DB exactcs

foreign-name used to be for DataServers, but my Progress .df is full of 'n/a' values.
 

Stefan

Well-Known Member
BuildBlocks threw 1048 exceptions:

Code:
Exception list for object build of c:\tfs\ef71\1\work\bl,c:\tfs\ef71\1\work\i,c:\progress\oe11.4\src\web
...
Table not found for buffer strolrec in compile unit c:\tfs\ef71\1\work\bl\assets\ifallgen\gdpr_trns.p
Table not found for buffer bufas_asmb in compile unit c:\tfs\ef71\1\work\bl\assets\ifallgen\gfasasmb.p
...
Table not found for buffer b2stformula in compile unit c:\tfs\ef71\1\work\bl\tools\system\convertids.p
End of BuildBlocks with 574391 total blocks in 2393 compile units of 2419 total files

I see that the buffer -> table conversion is hard-coded, but also that any attempt to make this intelligent will require a lot of work.

The other part comes from the truncated file names in the listing (as also mentioned on communities.progress.com):

Code:
...
File '...b\method\admweb.i' not found in compile unit c:\tfs\ef71\1\work\bl\system\modu\gethtmlbasestring.p
...

These could be resolved with a MATCHES find when the listed name starts with '...' - if the MATCHES lookup returns a unique record, the file can be considered found.
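
For what it is worth, a minimal sketch of that idea - assuming the known file names live in a (hypothetical) temp-table ttFile with a cFullName field, and that chListedName holds the name as it appears in the listing:
Code:
define variable chPattern as character no-undo.
define variable chFound   as character no-undo.
define variable inHits    as integer   no-undo.

if chListedName begins "..." then
do:
  /* note: "." is a single-character wildcard in MATCHES, so this can over-match slightly */
  chPattern = "*" + substring( chListedName, 4 ).

  for each ttFile where ttFile.cFullName matches chPattern:
    assign
      inHits  = inHits + 1
      chFound = ttFile.cFullName.
  end.

  if inHits = 1 then
    message "resolved to" chFound view-as alert-box.  /* unique hit - treat as found */
end.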
 

Stefan

Well-Known Member
The COMPILE statement's LISTING option allows PAGE-SIZE and PAGE-WIDTH settings - setting these to their maximum values should help:

PAGE-SIZE integer-expression
Identifies the number of lines per page in the listing file. The default page size is 55, and integer-expression must be between 10 and 127, inclusive.
PAGE-WIDTH integer-expression
Identifies the number of columns per page in the listing file. The default page width is 80, and integer-expression must be between 80 and 255, inclusive. Allow at least 12 columns beyond the width of your source lines for the information that precedes each line of code, so that the code appears in the listing output exactly as you typed it.
 

tamhas

ProgressTalk.com Sponsor
The buffer name matching is clearly hard work. I think the only real hope here ... other than a shop with strict naming conventions of a simple sort ... is to get Gilles' parser working so that we actually know what table the buffer is pointing at.

The mangled names are a real PITA.

I will experiment with the page options, but I don't believe that will change the format of the block summary or do much beyond reducing the number of line wraps and page breaks. One would need a page-size of 0 for no page breaks and a page-width of 0 for no wrap.
 

tamhas

ProgressTalk.com Sponsor
BTW, there *is* logic which scans all of the includes in the compile unit and looks for matches using the mangled name, so it would be interesting to figure out why that is failing in some cases.
 

tamhas

ProgressTalk.com Sponsor
A new release of ABL2DB has been published incorporating the following changes:

0.71 - Add handling for encrypted source
0.72 - Add handling for reverse slash in .df triggers
0.73 - Add support for SA fields in table and column schema
0.74 - Add support for CLOB-* properties in .df
0.75 - Corrections for \, <> "", and "*.cls"
0.76 - Speed enhancements to BuildBlocks.cls
0.77 - Accommodate large line numbers in BuildBlocks.cls
0.78 - Add FOREIGN-NAME and LABEL-SA in BuildSchemaDF
0.79 - Page size and width options in CompileDirectoryTree
0.80 - Index instead of entry in BuildBlocks.cls
0.81 - Shift BuildTableLinks structure to handle class references that look like database references
0.82 - Add optional DB alias handling

Find the code and revised documentation at http://www.oehive.org/ABL2DB
 

Cecil

19+ years progress programming and still learning.
Will ABL2DB work with SpeedScript and WebSpeed code, i.e., .html and CGI .w files?
 

tamhas

ProgressTalk.com Sponsor
Currently, it analyzes the XREF and LISTING output of compiling ABL. Anything that will produce those can be analyzed; anything that won't, it can't. It is now polite about including encrypted source in the structure, but obviously it can't know anything about the insides of the encrypted source. We could probably do something similar with non-ABL components, but I would need to know more about the details.
 

tamhas

ProgressTalk.com Sponsor
Yesterday, I released a new version of ABL2DB. This version incorporates Proparse to extend the range of functionality possible with XREF and LISTING. This first Proparse release provides for tracking of shared variable definitions and usage.

As the first release incorporating Proparse, this is a fairly big leap, hence the jump in version numbers. I expect that, as it is applied to new code bases, some adjustment is likely to be required; I have already discovered some "interesting" issues related to "vintage" coding practices. But the basic structure should allow for expansion to other areas in the not-too-distant future. My next target is resolving buffer names to tables.

The downloads are available here http://www.oehive.org/ABL2DB
Please read the instructions! :)
 

tamhas

ProgressTalk.com Sponsor
ABL2DB 1.08 has now been published. This includes the code for buffer name resolution, tracking of all shared object types, a shift to use XREF to determine if a file has compiled, improved handling for logical names and aliases, and some Proparse utilities like TokenLister.cls for showing how Proparse handles a particular piece of code.
 

tamhas

ProgressTalk.com Sponsor
A new version has been published including a number of enhancements, most conspicuously tracking of whether shared variables are assigned to or from in any given compile unit. This is ideal for shared variable refactoring since, among other things, it makes it very obvious which shared variables are not referenced in a compile unit. See http://www.oehive.org/node/2286 for a sample report.
 