Question: Can I reboot the machine after killing the DB?

WorkInProgress

New Member
If this is the only way, can I kill the DB by killing its PID, then reboot the OS (Solaris 5.10), and then start the DB? Will this corrupt the DB? I am asking because WebSpeed does not start and the DBs don't shut down (normal, unconditional, or emergency) from PROMON or proshut.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
Let me see if I understand your situation: you are currently unable to even shut down your database via proshut. What happens when you try? What do you see in the DB log?

But instead of trying to figure out whether you have database corruption, or how to fix this issue, you want to learn how to work around this by killing the database broker? If it were me I'd be on the phone to Progress TS immediately. You're paying for support (I assume), you might as well take advantage of it.

Yes, killing the broker will shut down the database, though I wouldn't do that. Before you do anything else, get the clients logged out (and keep them out) and back up the database. Then restore the backup somewhere else and open that DB to prove to yourself that it is usable. If you can take a quiet point (I don't know if they exist in 9.1E) then do that and you can take an OS copy of your DB files as well. I think you should start with the assumption that you have corruption, either in your DB or in your Progress installation, or both. Copy both of those backups off the box as well. Before you shut this DB down you may want to dump the schema and DB contents to flat files as well.

In answer to your question, crashing the database (e.g. by killing the broker or deleting the lock file) shouldn't corrupt the database. Progress crash recovery is pretty good. And rebooting the box won't corrupt the database, if you don't count the unlikely scenario that the reboot triggers an OS file system check which introduces file system corruption and possibly (further) database corruption. That's why it's a good idea to have copies of the DB somewhere other than this suspect machine.

I would suggest rebuilding this database, on a different server if at all possible, via ASCII dump and load. Having a DB that doesn't shut down is obviously not a healthy situation. And I cannot stress enough that you should involve Progress TS rather than trying to take this on yourself. If you're not on support you should probably talk to a reputable Progress DB consultant. Good luck.
 

WorkInProgress

New Member
It's a test server. No real-time users. No error was reported before the issue started. There has been no update in the WS, NS, or messenger/admserv logs for the past 3-4 days. I did check the logs and found only the following error in a few places in the DB logs for older dates:


bkioRead:Unknown O/S error during Read, errno 2, fd 412, len 4096, offset 0, file DBI23453. (9451)
Previous message sent on behalf of user 13. (5512)

Also, I did kill the DBs while troubleshooting. They ran crash recovery when restarted and started successfully, but as NS is not able to find the WSAdmin service it's not starting, and any attempt to query results in a hung request.

This all started when someone kicked off a .p some time back. The UNIX guy killed the user, but the issue still persists.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
So you have already restarted the DB. What is the state of it now? Does proshut shut it down normally?

Also, the bkioRead error indicates a problem with the client reading its temp files. It tried to read the DBI file and the client OS threw an error to Progress. This could be caused by disk or file system issues. If the temp files reside on a remote file system then it could also be caused by network issues.
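
As a side note, the errno value in that message can be decoded to see what the OS actually reported. The sketch below is only illustrative: the helper name `decode_errno` is made up for this example, and errno numbering is OS-specific, so ideally it would be run on the same platform as the client. On Solaris and Linux, errno 2 is ENOENT ("No such file or directory"), which would suggest the temp file was not found at the time of the read.

```python
import errno
import os
import re

# Log line copied from the DB log quoted earlier in this thread.
LOG_LINE = ("bkioRead:Unknown O/S error during Read, errno 2, fd 412, "
            "len 4096, offset 0, file DBI23453. (9451)")


def decode_errno(line):
    """Hypothetical helper: pull 'errno N' out of a log line and decode it.

    Returns (number, symbolic name, OS description), or None if the line
    carries no errno. The mapping comes from the machine running this
    script, which is assumed to match the client's OS.
    """
    match = re.search(r"errno (\d+)", line)
    if match is None:
        return None
    code = int(match.group(1))
    name = errno.errorcode.get(code, "UNKNOWN")
    return code, name, os.strerror(code)


if __name__ == "__main__":
    print(decode_errno(LOG_LINE))
```

This does not say why the file was missing. It only narrows the error from "unknown" to a specific OS condition worth checking against the temp directory the client was using.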
 

WorkInProgress

New Member
No change in the situation after the DB started again. I issued the proshut db_name -by command 2 days ago for one of the DBs for testing, and the last lines of the log still read:

Code:
15:39:43 BROKER  0: Multi-user session begin. (333)
15:39:43 BROKER  0: Begin Physical Redo Phase at 4096 . (5326)
15:39:44 BROKER  0: Physical Redo Phase Completed at blk 6225 off 788 upd 86457. (7161)
15:39:44 BROKER  0: Started for test_db using TCP, pid 3824. (5644)
15:39:44 WDOG    6: Started. (2518)
15:39:44 BIW     7: Started. (2518)
15:39:44 APW     8: Started. (2518)
15:39:44 APW     9: Started. (2518)
15:40:33 SHUT   10: Server shutdown started by progress on /dev/pts/21. (542)
16:12:33 SHUT   10: HANGUP signal received. (562)
 

TomBascom

Curmudgeon
The shutdown was interrupted by a hangup -- it was either killed or the window containing it was closed.
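
To put a number on it, here is a rough sketch that measures how long the shutdown sat there before the hangup arrived. It assumes the log line layout seen in the excerpt above (time, process type, process number, message text, message number in parentheses) and that both lines fall on the same day, since the excerpt carries no dates. The function names are invented for illustration. The message numbers 542 and 562 are taken from the excerpt itself.

```python
import re
from datetime import datetime

# The two relevant lines, copied from the log excerpt above.
LOG = """\
15:40:33 SHUT   10: Server shutdown started by progress on /dev/pts/21. (542)
16:12:33 SHUT   10: HANGUP signal received. (562)
"""

# time, process type, process number, message text, message number
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+(\S+)\s+(\d+):\s+(.*?)\s+\((\d+)\)$")


def parse_log(text):
    """Return a list of (time, process, number, message, msg_id) tuples."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            ts, proc, num, msg, msg_id = match.groups()
            when = datetime.strptime(ts, "%H:%M:%S")
            entries.append((when, proc, int(num), msg, int(msg_id)))
    return entries


def shutdown_gap(entries):
    """Time between 'shutdown started' (542) and 'HANGUP received' (562).

    Assumes both messages are present and occur on the same day.
    """
    started = next(e for e in entries if e[4] == 542)
    hangup = next(e for e in entries if e[4] == 562)
    return hangup[0] - started[0]


if __name__ == "__main__":
    print(shutdown_gap(parse_log(LOG)))
```

For the excerpt above the gap works out to 32 minutes, so the shutdown request was pending for about half an hour before the session that issued it received the hangup.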
 