Is DR up to date?

Jack@dba

Member
Hi all,

I would like to know whether our replication DR database is in sync with production.

On the source database the last AI extent archived is mfgprod.a27, but on the DR target database the last AI file copied is mfgprod.a243.
Whenever we check the replication status it shows "normal processing", and the "Received | Processed" block counts match.

1) Why is there a delay in copying AI files from source to target?
2) Once the a27 AI file is archived it should reach the target within a short time, but the copy is taking ages and the target is still at a243.
3) How can we show that both Prod and DR are in sync?

DB version : 11.7
OS : AIX 7.1
Product Name: Open Edge Repl Plus

Kindly find attachment for more info.
 

Attachments

  • dsrutil monitor.txt

TomBascom

Curmudgeon
I'm not sure what your question or issue is.

Your DSRUTIL output shows that both source and target are in "normal processing" and that you have zero seconds latency. So DR is as in sync with production as it can be. That's your "headline question".

You then ask about a delay copying AI files from source to target.

I do not know what you think you are seeing that indicates such a delay. For that matter I'm not at all clear what you mean by "copying AI files from source to target".

Archiving after-image extents is not necessarily the same as "copying from (replication) source to (replication) target". Archiving extents happens outside of replication. It can be, and sometimes is, *delayed* by replication. For instance, a LOCKED extent cannot be archived; you have to wait for replication to finish replicating that extent. But I don't see anything here indicating that anything like that is happening.

If your concern is a difference between the time stamps for extents being archived then you are misinterpreting how the two sets of extents are managed. Replication is not copying ai files, per se, from source to target. It is streaming the notes within those ai files. But the ai files themselves, on both sides, can be set up very differently. You could, for instance, have the two sides using time based extent archiving on completely different schedules. Or you could use fixed length extents of different sizes. In fact you are not actually required to have after-imaging enabled on the target at all (it's a good idea, but it isn't mandatory). Timestamps on after-image extents aren't telling you anything about how well synchronized you are.

Aside from that - why on earth do you have 243 after-image extents on the target? That is an unheard of number of after-image extents. Please explain why that is.

You have at least 27 on the source - that is also a very large number of after-image extents. I only rarely use more than 16 AI extents.
 

Jack@dba

Member
Hi Tom

Thanks for the details.

To answer your question:

Aside from that - why on earth do you have 243 after-image extents on the target? That is an unheard of number of after-image extents. Please explain why that is.

This setup was suggested by our Progress vendor about 3 years ago, to make room for the AI extents to grow. We have just been following it; our bad.

We are using AI management in on-demand mode, so an extent is archived only once the AI file fills. The client has identified some recent data and is asking us to provide evidence that the source and target databases are synchronized with the latest data. Is there any way to know whether the number of notes sent from the source side matches the target?
 

TomBascom

Curmudgeon
As I said: "Your DSRUTIL output shows that both source and target are in "normal processing" and that you have zero seconds latency. So DR is as in sync with production as it can be."

To verify that you are in sync:

1) run "dsrutil dbname -C status -verbose" on *both* source and target. (It is important to do both - there are cases where one thinks that it is in normal processing but the other disagrees.)

2) verify that there are no LOCKED after-image extents on the source

3) On the target run "dsrutil dbname -C monitor", select "a" for "agent status". You will see a screen like this:

Code:
   OpenEdge Replication Monitor                  Page 1

    Database:  /db/trax/xus61t2

    S.  Replication server status
    R.  Replication server remote agents
    A.  Replication agent status
    I.  Replication inter-agent status

    M.  Modify display defaults
    Q.  Quit

    Enter your selection: a

   OpenEdge Replication Monitor                  Page 1

    Database:  /db/dbname

    Database is enabled as OpenEdge Replication:  Target

    Agent:
        Name:                                   agent1
        ID:                                     1
        Host name:                              192.168.101.127
        State:                                  Normal Processing
        Ready:                                  Yes
        Critical:                               No
        Method:                                 Asynchronous
    Agent is waiting for:                       Nothing
    Maximum bytes in TCP/IP message:            16704
    Server/Agent connection time:               Tue Mar 14 05:14:28 2023
    Delay Interval (current / min / max):       15 / 5 / 500
    Transition information:
        Type:                                   Manual
    The last block received at:                 Tue Mar 14 07:58:02 2023
    Activity information:
        Blocks received:                        175603460
        Blocks processed:                       175603460

RETURN - show remaining, U - continue uninterrupted:

        Blocks acknowledged:                    0
        Notes processed:                        31919737840
        Transactions started:                   848051247
        Transactions ended:                     848051244
        Synchronization points:                 43777
    AI Block Information:
        Source RDBMS Block (Seq / Block):       14806 / 125354
        Last Processed Block (Seq / Block):     14806 / 125353
    Latency Information:
        Repl Server behind Source DB by:        2     second(s)
        Current Source Database Transaction:    24926554422
        Last Transaction Applied to Target:     24926554408
        Target Current as of (Target, Source):  Tue Mar 14 07:58:00 2023, Tue Mar 14 07:58:00 2023 with delta of 000:00:00
    Connect timeout:                            1800
    Listener port range:                        4201-4299
    Current listener port:                      4201
    Additional transition information:
        Replication set:                        1
        Database role:                          Reverse
        Transition to agents:                   agent2,agent3
        Restart after transition:               1
        Automatically begin AI:                 ?
        Automatically add AI areas:             0

RETURN - show remaining, U - continue uninterrupted:

In this case the database is 2 seconds behind the source:

Repl Server behind Source DB by: 2 second(s)

(If you really need to you can screen-scrape DSRUTIL to script that check.)
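A minimal sketch of that kind of screen-scrape, in Python. It assumes you have already captured the agent status screen as text (how you capture it from the interactive monitor is left to you), and that the "State:" and "Repl Server behind Source DB by:" lines look like the sample above. The function name `parse_agent_status` is made up for this illustration.

```python
import re

# Patterns based on the agent status screen shown above; other OpenEdge
# releases may format these lines differently (assumption).
STATE_RE = re.compile(r"^\s*State:\s+(.+?)\s*$", re.MULTILINE)
LATENCY_RE = re.compile(r"Repl Server behind Source DB by:\s+(\d+)\s+second")


def parse_agent_status(text):
    """Return (state, seconds_behind) from captured agent status text.

    Either value is None if the corresponding line is not found.
    """
    state = STATE_RE.search(text)
    lag = LATENCY_RE.search(text)
    return (
        state.group(1) if state else None,
        int(lag.group(1)) if lag else None,
    )


# Two lines copied from the sample screen above.
sample = """\
        State:                                  Normal Processing
        Repl Server behind Source DB by:        2     second(s)
"""

state, lag = parse_agent_status(sample)
print(state, lag)  # Normal Processing 2
```

A missing latency line comes back as `None` rather than 0, so a monitoring script can treat "could not read the lag" as a problem instead of as "in sync".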

You can try to use DSRUTIL or the VSTs on the source and target to compare the number of notes or transactions or blocks or whatever and attempt to use that data as verification that you are "in sync". But that only works under unrealistic scenarios - it fails if the source and target databases, or the replication server/agent, are started & stopped at different times because the counters are not coordinated - they are reset to zero when one of the aforementioned events occurs.

(It is possible that I have missed the magical counter that doesn't get reset like that and which would actually show any latency - if someone has better info please speak up!)
 

Jack@dba

Member
Hi Tom - I have found that the target database matches the last AI extent archived on the source.

The last block received at: Tue Mar 14 10:39:14 2023
Activity information:
Blocks received: 45588157
Blocks processed: 45588157

Hi Rob - the AI file fills up with AI notes, so the timing depends on transaction activity. I am not able to find a scripted process anywhere.
 

TomBascom

Curmudgeon
Extents don’t “arrive” and nothing that you are showing demonstrates anything happening in terms of extents.

One would hope that “blocks received” is the same as “blocks processed”. That is the expected state of affairs. It says nothing about whether or not you are in sync. All it says is that the agent has done the work that it knows about. Pull the network cable out of the wall and it will continue to tell you that it has done all of the work that it knows about. It won’t tell you anything about all of the work queued up on the source while the network is disconnected. Or while the replserv just isn’t running. Or when congestion or bad performance result in slow replication transfers.

To know if you are in sync by comparing blocks received & processed you also need to know how many were sent by the source. Do you see that metric anywhere? Hint: you cannot trust the agent to know that, you have to look on the source.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
Hi Rob - the AI file fills up with AI notes, so the timing depends on transaction activity. I am not able to find a scripted process anywhere.
Okay, so it seems you are using fixed-length extents, meaning that switches happen whenever the files fill. This does mean that your recovery point is variable: in periods of slow change activity, you stand to potentially lose more (i.e. older) data in a disaster scenario, as compared with time-based switching.

So then that raises the question of how extent state is managed. What causes a filled AI extent to be copied to an archive directory and the original file marked as empty? Are you using the AI File Management Daemon (aiarchiver)?

This setup was suggested by our Progress vendor about 3 years ago, to make room for the AI extents to grow.
If you are using fixed-length AI extents, the individual files don't grow. When you said "grow", did you mean "increase in number"? I don't understand this. Are you adding more AI extents over time?
 

TomBascom

Curmudgeon
Just guessing but with 243+ extents I am suspicious that extent changes only occur with nightly backups and that new extents are being added periodically.

But perhaps I am overly cynical this morning.
 

Jack@dba

Member
Rob - to answer your questions:

What causes a filled AI extent to be copied to an archive directory and the original file marked as empty? Are you using the AI File Management Daemon (aiarchiver)?
Yes, we are using the AI archiver in on-demand mode, so it archives only once an AI file fills up and switches to the next extent. So there is a question from the business: if one AI file fills within 10 minutes and other AI files take 30 to 50 minutes, will there be data loss if something goes wrong with the server in the middle?

On the other side, both databases are synchronized, so even then we can recover the database. Am I right?

If you are using fixed-length AI extents, the individual files don't grow. When you said "grow", did you mean "increase in number"? I don't understand this. Are you adding more AI extents over time?

We had an issue in DR where one of the target databases lost its connection and went into a pre-transition state; on the source side all AI extents got locked and the database stalled. So our Progress vendor suggested adding another 100 extents.
The actual catch is that we don't have proper monitoring in place for replication.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
So there is a question from the business: if one AI file fills within 10 minutes and other AI files take 30 to 50 minutes, will there be data loss if something goes wrong with the server in the middle?
This was the point of my comment about a disaster. If a disaster happened and for some reason you needed to recover by restoring a backup and rolling forward your AI extents, you would want to be able to manage how much data loss is involved. When your approach is to switch fixed extents whenever they fill up, an AI extent does not represent a unit of time. It could be 10 minutes of data or 6 hours or anything else. If you are switching on a timed basis, and you are able to roll forward all but your last file, you can predict how much you would lose.
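To put numbers on that, here is a small Python sketch. The switch timestamps are invented for the illustration (they do not come from this system), and the function names are hypothetical. It measures the longest gap between extent switches, which is roughly the most data, in time, that a single not-yet-archived extent could hold.

```python
from datetime import datetime, timedelta


def switch_intervals(switch_times):
    """Elapsed time between consecutive AI extent switches."""
    return [later - earlier
            for earlier, later in zip(switch_times, switch_times[1:])]


def worst_case_exposure(switch_times):
    """Longest interval covered by a single extent."""
    return max(switch_intervals(switch_times))


# Hypothetical switch times, for illustration only.
t0 = datetime(2023, 3, 14, 8, 0)

# Fill-based switching: busy periods fill an extent quickly, quiet ones slowly.
fill_based = [t0,
              t0 + timedelta(minutes=10),
              t0 + timedelta(minutes=60),
              t0 + timedelta(hours=7)]

# Time-based switching every 15 minutes.
time_based = [t0 + timedelta(minutes=15 * i) for i in range(5)]

print(worst_case_exposure(fill_based))  # 6:00:00
print(worst_case_exposure(time_based))  # 0:15:00
```

With fill-based switching the worst case depends entirely on how quiet the database gets; with timed switching it is bounded by the switch interval.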

On the other side, both databases are synchronized, so even then we can recover the database. Am I right?
I'm not sure exactly what you mean. Are you saying that you don't need AI files for recovery because you have OE Replication?

We had an issue in DR where one of the target databases lost its connection and went into a pre-transition state; on the source side all AI extents got locked and the database stalled. So our Progress vendor suggested adding another 100 extents.
The actual catch is that we don't have proper monitoring in place for replication.
A disaster could involve losing your production site and any data in it; e.g. fire, flood, explosion, or some other disaster. Adding more AI extents on the source side when your current extents are locked lets you keep writing AI notes but it doesn't improve your recovery position if that data doesn't reach the DR site.

You are correct, the real problem is that you lack replication monitoring. That should be your focus, and getting back to a synchronized state as soon as possible when you lose sync, rather than adding more AI extents.
 

TomBascom

Curmudgeon
So there is a question from the business: if one AI file fills within 10 minutes and other AI files take 30 to 50 minutes, will there be data loss if something goes wrong with the server in the middle?

If you lose the source database server in a fire, earthquake, flood, plague of locusts, or armed robbery then your data loss on the target is determined by what has been received on the target server (and the target server is, presumably, in a distinct physical location, not sitting right next to the source).

Obviously your data loss on the source is 100%. Unless you have also been making offsite backups and archiving after-image extents offsite. If you have been doing that then your data loss depends on when the last after-image extent was successfully archived offsite. Presuming that you are doing that, Rob's point is that the length of time that transactions are lost for is highly variable because you are switching extents as they fill rather than on a predictable schedule. And you cannot archive a BUSY extent (you could script something and hope for the best but the AIMGT process won't archive BUSY extents).

When you have LOCKED extents on the source you are "out of sync". How far out of sync depends on how much data is in those locked extents. But they are LOCKED because their contents have not yet made it to the replication target.

If you have zero locked extents and you are part-way through filling an "on-demand" extent your potential data loss on the target is determined by the "lag time" shown on the target. It should be zero seconds, or close to it, if replication is operating normally. That is to say that *both* the source and target report "normal processing" and the "seconds behind" on the target is zero or a small number of seconds. (Yes, I am a broken record on this topic.)

This is because *extents* are not being "sent" to the target. The contents of the after-image extents (aka "notes") that describe changes are being streamed to the target as they occur and the target is using those notes to update the target in real-time. The source is not waiting for you to fill an extent prior to sending stuff to the target. This is possible because after-image extents are written to sequentially. Once something is written to a BUSY ai extent it will not be modified.

It is very easy to see the streaming nature of replication updates if you have replication+ because you can attach a (read-only) session to the target and read changes to source data as they occur. If you want you can even stage an interruption by disrupting the network and see the changes stop, and then see them catch up when you plug the network back in.
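The "are we in sync" criteria described in this post can be combined into one check. This is only a sketch in Python: the function name, the argument names, and the 5 second lag threshold are assumptions made for the illustration, and the four input values have to be collected separately from the source and the target (for example from DSRUTIL output and the AI extent list).

```python
NORMAL = "normal processing"


def sync_problems(source_state, target_state, locked_extents,
                  seconds_behind, max_lag_seconds=5):
    """Return a list of reasons why source and target may be out of sync.

    An empty list means every check passed. max_lag_seconds is an
    arbitrary threshold chosen for this sketch; pick your own.
    """
    problems = []
    # Both sides must agree that replication is in normal processing.
    if source_state.strip().lower() != NORMAL:
        problems.append(f"source state is {source_state!r}")
    if target_state.strip().lower() != NORMAL:
        problems.append(f"target state is {target_state!r}")
    # LOCKED extents hold notes that have not reached the target yet.
    if locked_extents > 0:
        problems.append(f"{locked_extents} LOCKED AI extent(s) on the source")
    # An unreadable lag (None) is treated as a problem, not as zero.
    if seconds_behind is None or seconds_behind > max_lag_seconds:
        problems.append(f"target is {seconds_behind} second(s) behind")
    return problems


print(sync_problems("Normal Processing", "Normal Processing", 0, 2))  # []
print(sync_problems("Normal Processing", "Pre-Transition", 3, 120))
```

Returning the list of reasons rather than a bare yes/no gives an alerting script something useful to put in the message.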
 