ABL2DB

Stefan

Well-Known Member
Restarted with a fresh analysis database, BuildSubUnits is still failing:

Code:
[14/09/08@18:03:58.047+0200] P-005684 T-007948 1 4GL APPL           Start BuildSubUnits
[14/09/08@18:04:04.276+0200] P-005684 T-007948 1 4GL APPL           End BuildSubUnits with 2393 exceptions

BuildSubUnitsExceptions.txt contains the same messages:
Code:
Exception list for build subunits of c:\tfs\ef71\1\work\bl,c:\tfs\ef71\1\work\i
BuildSubUnits: DiskFile not found for c:\tfs\ef71\1\work\bl\assets\ifallgen\gdpr_cald.p
BuildSubUnits: DiskFile not found for c:\tfs\ef71\1\work\bl\assets\ifallgen\gdpr_calm.p

Hmmm diskfile.d only contains include files (.i) - aha SourceExtensions does not include 'p' - CompilableExtensions are "p,cls" and SourceExtensions are "cls,i" - while removing the 'noise' from SourceExtensions I also wiped out "p". Restarting... :)
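
A quick sanity check along these lines would have caught it. This is only a sketch: it assumes, as the diagnosis above suggests, that every extension in CompilableExtensions also has to appear in SourceExtensions. The variable names are made up for illustration and are not ABL2DB's own.

```abl
/* Hypothetical variables holding the two configured lists. */
define variable chCompilable as character no-undo initial "p,cls".
define variable chSource     as character no-undo initial "cls,i".
define variable chMissing    as character no-undo.
define variable chExt        as character no-undo.
define variable inExt        as integer   no-undo.
define variable inCount      as integer   no-undo.

inCount = num-entries(chCompilable).

/* Collect every compilable extension that is absent from the source list. */
do inExt = 1 to inCount:
  chExt = entry(inExt, chCompilable).
  if lookup(chExt, chSource) = 0 then
    chMissing = chMissing + (if chMissing = "" then "" else ",") + chExt.
end.

message "Compilable extensions missing from SourceExtensions:" chMissing
  view-as alert-box.
```

With the two lists quoted above, the only extension reported is p.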
 

tamhas

ProgressTalk.com Sponsor
I noticed the comment about only .i earlier and meant to say something, but ...
 

Stefan

Well-Known Member
:)

Two suggestions for the feature list:

1. database aliases
2. some kind of progress indicator - BuildBlocks has now been burning one core for over 24 hours (11.4 x64 on Vista x64 on a 2.6 GHz i7 with code and db on an SSD) and I have no clue how far along it is. analysis_7 is 60 MB, analysis_8 is 20 MB. Since this was only intended to be a quick test, the analysis database is connected single-user, so I can't go in and watch record counts soar :)
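
For a batch run, one low-tech form of progress indicator would be a log line every so many files. A minimal sketch, assuming the work is a loop over compile units; the file count, the interval of 100, the log file name and the PROGRESS subsystem label are all placeholders, not anything ABL2DB does today:

```abl
define variable inFile  as integer no-undo.
define variable inTotal as integer no-undo initial 2394. /* placeholder count */

/* Only open a log if the session does not already have one. */
if log-manager:logfile-name = ? then
  log-manager:logfile-name = "progress-demo.log". /* hypothetical file name */

do inFile = 1 to inTotal:
  /* ... process one compile unit here ... */

  /* Report every 100 files so the log shows how far the run has got. */
  if inFile modulo 100 = 0 then
    log-manager:write-message(
      substitute("BuildBlocks: &1 of &2 files processed", inFile, inTotal),
      "PROGRESS").
end.
```

Because it only writes to the log, this would not get in the way of an unattended run.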

The log so far:

Code:
[14/09/08@19:26:03.328+0200] P-000788 T-005304 1 4GL -- Logging level set to = 2
[14/09/08@19:26:03.328+0200] P-000788 T-005304 1 4GL -- No entry types are activated
[14/09/08@19:26:03.329+0200] P-000788 T-005304 1 4GL -- Logging level set to = 3
[14/09/08@19:26:03.329+0200] P-000788 T-005304 1 4GL APPL           Start ABL2DB Test
[14/09/08@19:26:04.349+0200] P-000788 T-005304 1 4GL APPL           Start CompileDirectory Tree
[14/09/08@20:28:25.639+0200] P-000788 T-005304 1 4GL APPL           End CompileDirectory Tree with 16 exceptions
[14/09/08@20:28:25.701+0200] P-000788 T-005304 1 4GL APPL           Start BuildSchema
[14/09/08@20:28:25.702+0200] P-000788 T-005304 1 4GL APPL           End BuildSchema with 0 exceptions
[14/09/08@20:28:25.726+0200] P-000788 T-005304 1 4GL APPL           Start BuildDiskFiles
[14/09/08@20:28:43.666+0200] P-000788 T-005304 1 4GL APPL           End BuildDiskFiles with 0 exceptions
[14/09/08@20:28:43.701+0200] P-000788 T-005304 1 4GL APPL           Start BuildSubUnits
[14/09/08@22:11:53.768+0200] P-000788 T-005304 1 4GL APPL           End BuildSubUnits with 164 exceptions
[14/09/08@22:11:53.801+0200] P-000788 T-005304 1 4GL APPL           Start BuildBlocks

Just having a quick look at BuildBlocks.cls.

The ProcessBlocks method counts the number of lines in the file once for every line in the file (a function call in the TO expression of a DO loop or the WHILE expression of a REPEAT is re-evaluated on every iteration of the loop), and then does that twice. My lst file directory contains 3.3 GB spread over 2394 files -> average size is about 1.4 MB per file. List files seem to be wrapped at 80 chars -> so that's an average of roughly 17,000 lines per file -> a lot of unnecessary counting:

Code:
    copy-lob from file ipchListFile to mlcListFile.
    do minLine = 1 to num-entries( mlcListFile, "~n" ):
      if entry(minLine, mlcListFile, "~n" ) begins "-------------------- ---- ----------- ---- --------------------------------"
      then leave.
    end.

    repeat while minLine < num-entries( mlcListFile, "~n" ):
      minLine = minLine + 1.
...

Count the number of lines once and then use the count.

Code:
    copy-lob from file ipchListFile to mlcListFile.
    ilines = num-entries( mlcListFile, "~n" ).
    do minLine = 1 to ilines:
      if entry(minLine, mlcListFile, "~n" ) begins "-------------------- ---- ----------- ---- --------------------------------"
      then leave.
    end.

    repeat while minLine < ilines:
      minLine = minLine + 1.
...
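
A different technique from the one above, offered only as a sketch: reading the listing through a stream, one line per IMPORT, avoids both NUM-ENTRIES and ENTRY on the longchar (ENTRY presumably has to scan from the start of the string as well). The file path and variable names here are hypothetical, and the file has to exist for this to run.

```abl
define variable chListFile as character no-undo initial "sample.lst". /* hypothetical path */
define variable chLine     as character no-undo.
define variable inLine     as integer   no-undo.
define variable inHeader   as integer   no-undo.

define stream stList.

input stream stList from value(chListFile).

/* REPEAT ends by itself when IMPORT hits end of file. */
repeat:
  import stream stList unformatted chLine.
  inLine = inLine + 1.

  /* Remember the first line that starts the block summary table. */
  if inHeader = 0
     and chLine begins "-------------------- ---- ----------- ---- --------------------------------"
  then inHeader = inLine.
end.

input stream stList close.

message "Lines read:" inLine "Header at line:" inHeader
  view-as alert-box.
```

Each line is touched once, so the cost grows with the file size rather than with its square.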
 

Stefan

Well-Known Member
BuildBlocks finished after a little over 26 hours, after which a syserror occurred:

Code:
[14/09/08@19:26:03.328+0200] P-000788 T-005304 1 4GL -- Logging level set to = 2
[14/09/08@19:26:03.328+0200] P-000788 T-005304 1 4GL -- No entry types are activated
[14/09/08@19:26:03.329+0200] P-000788 T-005304 1 4GL -- Logging level set to = 3
[14/09/08@19:26:03.329+0200] P-000788 T-005304 1 4GL APPL           Start ABL2DB Test
[14/09/08@19:26:04.349+0200] P-000788 T-005304 1 4GL APPL           Start CompileDirectory Tree
[14/09/08@20:28:25.639+0200] P-000788 T-005304 1 4GL APPL           End CompileDirectory Tree with 16 exceptions
[14/09/08@20:28:25.701+0200] P-000788 T-005304 1 4GL APPL           Start BuildSchema
[14/09/08@20:28:25.702+0200] P-000788 T-005304 1 4GL APPL           End BuildSchema with 0 exceptions
[14/09/08@20:28:25.726+0200] P-000788 T-005304 1 4GL APPL           Start BuildDiskFiles
[14/09/08@20:28:43.666+0200] P-000788 T-005304 1 4GL APPL           End BuildDiskFiles with 0 exceptions
[14/09/08@20:28:43.701+0200] P-000788 T-005304 1 4GL APPL           Start BuildSubUnits
[14/09/08@22:11:53.768+0200] P-000788 T-005304 1 4GL APPL           End BuildSubUnits with 164 exceptions
[14/09/08@22:11:53.801+0200] P-000788 T-005304 1 4GL APPL           Start BuildBlocks
[14/09/10@00:30:31.745+0200] P-000788 T-005304 1 4GL APPL           End BuildBlocks with 1280 exceptions
[14/09/10@00:30:31.748+0200] P-000788 T-005304 1 4GL SYSERROR       Driver: Unexpected Exception: ** Invalid character in numeric input (. (76)
[14/09/10@00:30:31.748+0200] P-000788 T-005304 1 4GL APPL           End ABL2DB

The invalid character error seems to be bubbling out of BuildBlocks. All the listed exceptions, though, are about missing databases, since I did not provide the .df files:

Code:
Exception list for object build of c:\tfs\ef71\1\work\bl,c:\tfs\ef71\1\work\i
Database exactcs not found for buffer dpr_cald in compile unit c:\tfs\ef71\1\work\bl\assets\ifallgen\gdpr_cald.p
Database exactcs not found for buffer dpr_cald in compile unit c:\tfs\ef71\1\work\bl\assets\ifallgen\gdpr_cald.p
...
Database exactcs not found for buffer debout in compile unit c:\tfs\ef71\1\work\bl\finance\ifallgen\gdebout.p
End of BuildBlocks with 166963 total blocks in 333 compile units of 334 total files

334 total files is way too low. The last file in the exception list is the one that caused the unexpected exception. The exception is caused by Progress handling line numbers as 16-bit signed integers. These are represented as:
- lines 1 - 32767: 1 - 32767
- line 32768: -(
- lines 32769 - 65535: -32767 -> -1
- lines 65536 - : 1 - 32767 - have fun differentiating these from lines 1 - 32767

excerpt from finance/ifallgen/gdebout.lst:

Code:
...fallgen\gdebout.p 32760 Do          No                                    
...fallgen\gdebout.p   -( Do          No                                    
...fallgen\gdebout.p -32745 Do          No                                    

<snip>

  32766     rops.table_id = ttfilters.link_table AND bu4ttIfallProps.field_id = 
  32766   9 ttfilters.link_filter_field.
  32767   9 
 -32768  10                               IF ttfilters.link_one_on_one THEN DO:
 -32767  10                                  ASSIGN
 -32766  10                                     lvalue_found = FALSE
 

tamhas

ProgressTalk.com Sponsor
What do you mean about aliases that you can't do with the connection parameters?

A progress indicator is a problem for something I imagine being used in production as a batch process. In my testing I have been doing OK by looking at the directory listing of logs for the time stamps and occasionally looking at one of the logs, e.g., if it is growing a lot.

The line numbering is clearly a Progress bug. I think one of us should report it to tech support, but that person would need to have the source in question ... i.e., I will be happy to lead the report, but would need that source. One could comment on source files with over 32000 lines in them or even 65000!!!! My sympathies on any developer having to make sense of that.

Do you have any suggestions of how to handle it other than simply error checking the assignment and leaving it zero?

I will look at the looping.
 

Stefan

Well-Known Member
I can handle a logical and a physical name with connection parameters but I cannot, as far as I know, create aliases in a connect statement.

CREATE ALIAS is the statement we use after connecting.

The line number issue is a long-standing one which we ran into a long time ago. We replace -( with 32768 and map a negative number to 65536 + line number. Beyond that we are simply hosed and need to cross our fingers that there is no collision.
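
As a sketch of that mapping (the function and parameter names are invented for illustration, and anything that is not a number comes back as unknown rather than zero):

```abl
/* Hypothetical helper: turn a line number as printed in the listing
   back into the real line number, for lines up to 65535. */
function DecodeListingLine returns integer (input ipchLineNo as character):
  define variable inLine as integer no-undo.

  ipchLineNo = trim(ipchLineNo).

  /* Line 32768 is printed as "-(" in the block summary. */
  if ipchLineNo = "-(" then return 32768.

  inLine = integer(ipchLineNo) no-error.
  if error-status:error then return ?.

  /* Negative values are lines 32768 - 65535 wrapped around 16 bits. */
  if inLine < 0 then return 65536 + inLine.

  return inLine.
end function.

message DecodeListingLine("32760")
        DecodeListingLine("-(")
        DecodeListingLine("-32745")
  view-as alert-box.
```

For the three block lines in the gdebout.lst excerpt this gives 32760, 32768 and 32791. Lines from 65536 on still cannot be told apart from low line numbers.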

These large sources are the output of a generator and contain massive include files with preprocessor constants for among others table ids and field ids.

As for the progress indicator, simply unbuffering the exception output would help; right now the output is buffered by the OS.
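
A minimal sketch of what that could look like, assuming the exceptions are written with PUT to a named stream (how ABL2DB actually writes them is not shown in this thread, and the stream and file names are placeholders):

```abl
define stream stExc.

/* UNBUFFERED asks the AVM not to hold output back in a buffer. */
output stream stExc to value("BuildBlocksExceptions.txt") unbuffered.

put stream stExc unformatted "Exception list for object build" skip.

/* ... one PUT per exception, at the moment it is detected ... */

output stream stExc close.
```

The exception file should then grow while the step is still running, which is enough to see that it is alive and roughly where it is.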
 

tamhas

ProgressTalk.com Sponsor
OK, unbuffering and the arithmetic is easy.

I'm still not sure what you want to do with the alias.
 

Stefan

Well-Known Member
tamhas said:
I'm still not sure what you want to do with the alias.

We support OpenEdge, Oracle and SQL Server databases (the latter two using the DataServers) with the same code base. Oracle and SQL Server have schema holders which contain the meta info (_file etc.); in the OpenEdge database the schema holder and the actual database are one and the same. For the OpenEdge database we create a schema holder alias. All code that needs to get _file info can then simply access exactcssh._file on all database platforms.
 

tamhas

ProgressTalk.com Sponsor
OK, so if the code has the alias, I presume that all the XREF entries would show those references with the exactcssh database name, no? And, if so, why not supply a .df for exactcssh with the relevant metaschema files, as I did with DICTDB (which was for unqualified metaschema references)?
 

Stefan

Well-Known Member
Are you suggesting connecting the same database twice? Without the sh alias the code will not compile, since the metaschema tables are explicitly referenced as exactcssh._file and edissh._file. All normal table access is unqualified in the source and is made explicit at compile time.

An alternative could be to connect the database with the schema holder name as its logical name, in which case code explicitly referencing the actual database (none, except for some DataServer send-sql stuff) would not compile.

I think I will just create the aliases in the launcher, much less workaroundish.
 

tamhas

ProgressTalk.com Sponsor
The schema related files in ABL2DB are created from the supplied DFs, not from connected databases. The compile is done with the database connections you specify. They are not linked ... as long as the results of one make sense to the results of the other.

BTW, when you figure out a combination that works, I would love a description to add to the site for the next person with a similar problem.
 

tamhas

ProgressTalk.com Sponsor
ABL2DB 0.76 has the changes to BuildBlocks.cls and is on OE Hive. It will be interesting to see how much difference this makes since this function only takes about an hour on my source sample with a similar number of files ... but not nearly the size of files. I get that taking num-entries only once is more efficient, but I also wouldn't have thought it was a particularly expensive operation.
 

tamhas

ProgressTalk.com Sponsor
Just for giggles I ran a little benchmark on a file of about 90,000 lines, albeit shortish ones since it was a .df. 10,000 num-entries on a longchar containing the file took just under 16 seconds or 0.0016 per. At that rate it would take 54,000,000 num-entries executions for 24 hours. Given 3000 files, that would be 18,000 num-entries per file on average. I suppose that, if you have line counts over 65000, this is quite possible.
 

Stefan

Well-Known Member
tamhas said:
The schema related files in ABL2DB are created from the supplied DFs, not from connected databases. The compile is done with the database connections you specify. They are not linked ... as long as the results of one make sense to the results of the other.

Yes, and for my code base to compile, exactcssh and exactcs must both be valid database references. I've hacked the schema holders into the Driver class with a fixed naming convention:

Code:
    def var cdb as char no-undo.

    do minWhich = 1 to num-entries( chDataBaseList ):
      cdb = entry( minWhich, chDataBaseList ).
      connect value( cdb ) value( entry( minWhich, chDataBaseConnectParams, "|" )).
      create alias value( substitute( "&1sh":u, cdb ) ) for database value( cdb ).
    end.

tamhas said:
BTW, when you figure out a combination that works, I would love a description to add to the site for the next person with a similar problem.
 

tamhas

ProgressTalk.com Sponsor
So, in the version where you have an OpenEdge DB, you actually have two aliases for the same DB and the XREF references will reflect both? Does this mean that all your table references in the code are fully qualified with DB names?

BTW, what does
- line 32768: -(
mean? That this line is just omitted in the listing?

I.e., I am doing the conversion and then going to convert this to a hopefully real number, and I'm wondering what value minLineNo will have after assign minLineNo = integer(mchLineNo) no-error. if this line is referenced.
 

Stefan

Well-Known Member
tamhas said:
Just for giggles I ran a little benchmark on a file of about 90,000 lines, albeit shortish ones since it was a .df. 10,000 num-entries on a longchar containing the file took just under 16 seconds or 0.0016 per. At that rate it would take 54,000,000 num-entries executions for 24 hours. Given 3000 files, that would be 18,000 num-entries per file on average. I suppose that, if you have line counts over 65000, this is quite possible.

:)

I just ran a num-entries 'benchmark' on my largest .lst file - it's 9.5 MB and has 174,186 lines.

Code:
DEF VAR lcc AS LONGCHAR NO-UNDO.

COPY-LOB FROM FILE ("C:\Temp\abl2db\work\lst\finance\ifallgen\gatrss-general.p.lst") to lcc.

DEF VAR itime AS INT NO-UNDO EXTENT 5 INITIAL {&SEQUENCE}.
DEF VAR iline AS INT NO-UNDO INITIAL 1.
DEF VAR ilines AS INT NO-UNDO.
DEF VAR cline AS CHAR NO-UNDO.


itime[{&SEQUENCE}] = ETIME.


DO WHILE iline < NUM-ENTRIES( lcc, "~n" ):

   cline = ENTRY( iline, lcc, "~n" ).
   iline = iline + 1.

END.

itime[{&SEQUENCE}] = ETIME.

iline = 1.
ilines = NUM-ENTRIES( lcc, "~n" ).

DO WHILE iline < ilines:

   cline = ENTRY( iline, lcc, "~n" ).
   iline = iline + 1.

END.

itime[{&SEQUENCE}] = ETIME.

MESSAGE
   itime[2] - itime[1] SKIP
   itime[3] - itime[2]
VIEW-AS ALERT-BOX.

I said 'just' and wanted to put the results up - but my client is choking again.

Time to do some maths - a single num-entries on this file takes approx 15 ms (100 num-entries = 1645 ms, 1000 num-entries takes 15525 ms) - repeat that unnecessarily 174,186 times and you have just wasted 2600 seconds = 43 minutes for nothing on one file...

Note that my run stopped with the fatal on the line number overflow after only 334 files - 24 hours for 334 files - I have 3000 files... :)
 

tamhas

ProgressTalk.com Sponsor
Well, hopefully that is cured by the current version on OE Hive.

What about the - line 32768: -( question?
 

tamhas

ProgressTalk.com Sponsor
BTW, a 175,000 line lst file is absolutely mind boggling for ABL. What size is the corresponding debug listing file?
 

Stefan

Well-Known Member
tamhas said:
So, in the version where you have an OpenEdge DB, you actually have two aliases for the same DB and the XREF references will reflect both? Does this mean that all your table references in the code are fully qualified with DB names?

No, there is the logical name and there is an alias (so yes, there are two names for the same database). Normal tables are not fully qualified, since table names are unique across all databases; all _file references are fully qualified.

Code:
for each debtor no-lock:
end.

for each exactcssh._file no-lock:
end.

tamhas said:
BTW, what does
- line 32768: -(
mean? That this line is just omitted in the listing?

I.e., I am doing the conversion and then going to convert this to a hopefully real number, and I'm wondering what value minLineNo will have after assign minLineNo = integer(mchLineNo) no-error. if this line is referenced.

-( is how 32768 is displayed

Interestingly this bug also used to exist in the Progress.Lang.Error callStack - which is where we ran into it when generating errors to find out which line calls were coming from, but this has apparently been fixed:

Code:
DEF VAR ii AS INT NO-UNDO.

OUTPUT TO "c:/temp/overflow.p".

PUT UNFORMATTED
   'FUNCTION showLine RETURNS CHARACTER ():' SKIP(1)
   '   DEF VAR ii AS INT NO-UNDO.' SKIP
   '   ii = INTEGER( "a" ).' SKIP
   '   CATCH e AS Progress.Lang.Error:' SKIP
   '      RETURN ENTRY( 2, e:callStack, "~~n" ).' SKIP
   '   END CATCH.' SKIP
   'END FUNCTION.' SKIP (1)

   'SESSION:ERROR-STACK-TRACE = TRUE.' SKIP(1).

DO  ii = 1 TO 130535:

   PUT UNFORMATTED SKIP (1).

END.

PUT UNFORMATTED
   'MESSAGE showLine() VIEW-AS ALERT-BOX.' SKIP .

OUTPUT CLOSE.

RUN c:/temp/overflow.p.
 

Stefan

Well-Known Member
tamhas said:
BTW, a 175,000 line lst file is absolutely mind boggling for ABL. What size is the corresponding debug listing file?

The lst is 175,000 lines, the original generated source file is 58,312 lines, the debug-listing is 112,742 lines.
 