I am investigating the cause of a problem we had with an undo tablespace
and would appreciate hearing your thoughts on this. While I have fixed
the problem by creating a new replacement undo tablespace, and
everything now seems fine, I am interested in why the problem occurred,
and whether there is a better solution than the one I employed.
We are running IBM Rational ClearQuest on Oracle 10g R2. On Monday
ClearQuest users started getting the error ORA-01552
: cannot use system
rollback segment for non-system tablespace 'CQ_USER' when trying to add
a new entry within ClearQuest.
Full error message from ClearQuest
ERROR! SQLExecute: RETCODE=-1, State=HY000, Native Error=1552
SQL statement="update dbglobal set next_request_id=906 where site_id=2
[DataDirect][ODBC Oracle Wire Protocol driver][Oracle]ORA-01552
use system rollback segment for non-system tablespace 'CQ_USER'.
Upon investigation using Enterprise Manager, it appeared that the
CQ_USER tablespace had no space left - a critical error for this was
logged on the preceding Friday - and it was not set to autoextend. We
rectified this by changing it to autoextend in Enterprise Manager and
applying the changes, thus clearing the tablespace critical errors.
(please don't kill me for the tablespace being full, I'm just helping
out an ex-employer, monitoring the database was not my responsibility)
However, this did not resolve the initial Oracle error shown in ClearQuest, so we investigated further.
On Thursday/Friday, the alert error log in BDUMP logged a few ORA-1653
errors relating to tables that could not be extended in the (then)
almost full CQ_USER tablespace.
On the Saturday, there was the following:
KCF: write/open error block 0x29 online=1
error=27072 text: 'OSD-04008: WriteFile() failure, unable to write to file
O/S-Error: (OS 33) The process cannot access the file because another
process has locked a portion of the file.'
Automatic datafile offline due to write error on
file 2: D:\ORACLE\ORADATA\...\UNDOTBS01.DBF
Since then - even after extending the CQ_USER tablespace on Monday - the
system logged many of these:
Errors in file c:\oracle\admin\...\..._pmon_2604.trc:ORA-00376
: file 2 cannot be read at this timeORA-01110
: data file 2: 'D:\ORACLE\ORADATA\...\UNDOTBS01.DBF'
Enterprise Manager claimed the undo tablespace (UNDOTBS1) was ReadWrite
status, and 0MB in size. When I used the SQLPlus command line to confirm
the status of this tablespace it was listed as AVAILABLE. Dbverify on
the file UNDOTBS01.DBF showed nothing failing or corrupt, but Enterprise
Manager did list this file as "Offline (Needs Recovery)". Based on this,
and some information I found online, I decided to create a new undo
tablespace and switched the database to use that instead. The ClearQuest
error no longer occurs.
1. Was the undo datafile being locked/going offline a result of the
tablespace maxing out, or could it have been an unrelated error? (note
that we have Symantec backups running daily on the database server and
database, I seem to remember reading somewhere that backup software can
sometimes cause datafile problems?)
2. Was recreating the undo tablespace the best solution, or should I
have done something else?
Many thanks for any assistance you can offer,