Finding text in an HTML doc

vadilloe

New Member
Hi,

I need to find a given string (text) in an HTML document. I've tried, but I can't get it to work. Does anyone have an idea?
My OS is XP SP2 and my Progress version is 9.1D.
T.I.A.

VAD
 

joey.jeremiah

ProgressTalk Moderator
Staff member
the proversion and os are very nice, but what are you searching for :) ( how
would you find it ) ?

are there other options besides screen scraping ? one of the problems is the
web page can change frequently

i'd bet there are plenty of screen scraping utils and articles on the net
 

joey.jeremiah

ProgressTalk Moderator
Staff member
this'll work

Code:
function searchStr returns logical

    ( pcFile as char, pcStr as char ):



    define var cEdit as char no-undo.

    form
        cEdit view-as editor size 40 by 10 large
    with frame frmEdit.



    define var ok as logi no-undo.

    ok = cEdit:read-file( pcFile ).
    if not ok then return no.

    ok = cEdit:search( pcStr, 0 ).
    return ok.

end function. /* searchStr */

if you'd like something more efficient you could start by looking at sarls.p

http://www.progresstalk.com/showthread.php?t=94831&highlight=sarls

hth
 

bulklodd

Member
To joey.jeremiah:

Great!!!

But unfortunately there's a problem: it doesn't work with -b. Run as a background process, it returns a completely different result.
 

joey.jeremiah

ProgressTalk Moderator
Staff member
stupid me :)

widgets and batch mode, not a good idea



second try ...

Code:
/* counts the number of times a str shows up in a file */



define stream stSrc.

function searchStr returns int

    ( pcFile as char, pcStr as char ):



    define var iLength  as int no-undo.
    define var iCnt     as int no-undo.

    assign
        iLength = length( pcStr )
        iCnt    = 0.



    define var rBuffer  as raw  no-undo.
    define var cBuffer  as char no-undo.
    define var cTrail   as char no-undo.

    assign
        length( rBuffer )   = 1024
        cBuffer             = ""
        cTrail              = "".



    input stream stSrc from value( pcFile )

        binary no-convert. /* codepage conv can be very costly with large files, up to you */

    repeat:



        /* read file in increments of 1k */

        import stream stSrc unformatted rBuffer.

        cBuffer = cTrail + get-string ( rBuffer, 1 ).



        /* search thru the buffer */

        define var iCursor  as int no-undo.
        define var iLast    as int no-undo.

        assign
            iCursor = 1
            iLast   = iCursor.

        repeat:

            if length( cBuffer ) - iCursor + 1 < iLength then leave. /* length doesn't fit in space left */

            iCursor = index( cBuffer, pcStr, iCursor ).

            if iCursor = 0 then leave.


            
            assign
                iCursor = iCursor + iLength
                iLast   = iCursor
                iCnt    = iCnt + 1.

        end. /* repeat */



        /* there could be cases where the search str falls between buffers 
         * so a trail is carried on to the next buffer */

        if iCursor <= length( cBuffer ) then

        cTrail = substr( cBuffer, max( iLast,
                 length( cBuffer ) - ( iLength - 1 ) + 1 ) ).

        else cTrail = "". /* last match ran to the end of the buffer, so drop the old trail */

    end. /* repeat */

    input stream stSrc close.



    return iCnt.

end function. /* searchStr */
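
If only a yes/no answer is needed, a plain line-by-line scan is another option that avoids widgets and so should be safe with -b. The following is a minimal sketch and not code from this thread: fileContains, stIn and the sample file name are made-up names. It assumes every line fits in a character variable and that the search string never spans a line break. Like the code above, it relies on INDEX, which ignores case for ordinary character variables.

Code:
/* returns yes as soon as one line of the file contains the string */

define stream stIn. /* hypothetical stream name */

function fileContains returns logical
    ( pcFile as char, pcStr as char ):

    define var cLine  as char no-undo.
    define var lFound as logi no-undo.

    /* search() returns ? when the file can't be found */
    if search( pcFile ) = ? then return no.

    input stream stIn from value( pcFile ) no-echo.

    repeat:

        /* end of file raises endkey, which ends the repeat block */
        import stream stIn unformatted cLine.

        if index( cLine, pcStr ) > 0 then do:
            lFound = yes.
            leave.
        end.

    end. /* repeat */

    input stream stIn close.

    return lFound.

end function. /* fileContains */

/* "index.html" and "<table" are placeholder values */
message fileContains( "index.html", "<table" ).

For files with very long lines, or when the number of hits matters, the buffered version above is the better fit.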
 

bulklodd

Member
As promised, a tweaked version:

Code:
function searchStr2 returns logical
    ( pcFile as char, pcStr as char ):
 
 
    define var cEdit as char no-undo.
    form
        cEdit view-as editor size 40 by 10 large
    with frame frmEdit.
 
    cEdit:source-editor = YES.
 
    define var ok as logi no-undo.
    ok = cEdit:read-file( pcFile ).
    if not ok then return no.
    DEFINE VARIABLE i AS INTEGER    NO-UNDO.
    i = cEdit:source-command("find " + quoter(pcStr) + ",R" ,"").
    RETURN i = 0.
end function. /* searchStr2 */
 
/* Search for 'FIND LAST' or 'FOR LAST' expressions in the file */
MESSAGE searchStr2(filename,"(FIND|FOR)\32*LAST")
   VIEW-AS ALERT-BOX INFO BUTTONS OK.
 