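Platform:
Windows 2003, NetBackup 6.5.6 media server
Symptom:
After the system boots, a check of the NetBackup processes shows that the vmd process has not started. The inetd service also fails to start.
Troubleshooting steps:
1. Restarting NetBackup does not help; the fault persists.
2. Starting the inetd service manually fails, and the system reports a user name error.
3. Asking the administrator about recent operations reveals that the superuser password was changed recently.
4. The inetd service turns out to be started under the superuser account. Update the logon password configured for the service.
5. Restart the NetBackup processes; the vmd service now starts normally.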
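NetBackup nbdevquery command explained
Purpose of nbdevquery:
This command mainly checks the status of NetBackup disk-type storage, for example the disk pool and storage server status of OpenStorage, PureDisk and AdvancedDisk.
Command options: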
[-listdp]
[-listdv]
[-liststs]
[-listmediaid]
[-listmounts]
[-listglobals]
[-listconfig]
[-listreptargets]
Option descriptions:
-listdp lists all disk pools in the system.
[root@nbu1 staging]# nbdevquery -listdp
V7.5 dp_nbu1 1 7.55 7.55 1 98 80 -1 nbu1
V7.5 ad_nbu1 1 3.94 3.94 1 98 80 -1 nbu1
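The output shows two disk pools under NetBackup 7.5. Add the -U option for detailed output.
-listdv shows the status of the disk volumes in the disk pools, mainly whether a disk pool is online or offline.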
[root@nbu1 staging]# nbdevquery -listdv -stype PureDisk -U
Disk Pool Name : dp_nbu1
Disk Type : PureDisk
Disk Volume Name : PureDiskVolume
Disk Media ID : @aaaax
Total Capacity (GB) : 7.55
Free Space (GB) : 6.66
Use% : 11
Status : DOWN
Flag : ReadOnWrite
Flag : AdminUp
Flag : InternalDown
Num Read Mounts : 0
Num Write Mounts : 1
Cur Read Streams : 0
Cur Write Streams : 0
Num Repl Sources : 0
Num Repl Targets : 0
The output shows that the disk pool is currently DOWN.
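A minimal Python sketch of how the -U output above could be checked automatically. It assumes the `Key : Value` layout shown in the sample and that every volume record begins with a `Disk Pool Name` line; the function names are hypothetical and not part of NetBackup.

```python
def parse_listdv(text):
    """Split `nbdevquery -listdv -U` style output into one dict per disk volume."""
    volumes = []
    current = None
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "Disk Pool Name":
            # Assumption: each record starts with this field.
            current = {"Flags": []}
            volumes.append(current)
        if current is None:
            continue
        if key == "Flag":
            # "Flag" repeats, so collect every value in a list.
            current["Flags"].append(value)
        else:
            current[key] = value
    return volumes


def down_volumes(volumes):
    """Return the records whose Status field is DOWN."""
    return [v for v in volumes if v.get("Status") == "DOWN"]


# Sample copied from the listing above (shortened).
SAMPLE = """\
Disk Pool Name : dp_nbu1
Disk Type : PureDisk
Disk Volume Name : PureDiskVolume
Disk Media ID : @aaaax
Total Capacity (GB) : 7.55
Free Space (GB) : 6.66
Use% : 11
Status : DOWN
Flag : ReadOnWrite
Flag : AdminUp
Flag : InternalDown
"""

for vol in down_volumes(parse_listdv(SAMPLE)):
    print(vol["Disk Pool Name"], vol["Disk Volume Name"], vol["Flags"])
```

In practice the text would come from running the command and capturing its output; the embedded sample only keeps the sketch self-contained.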
-liststs lists all storage servers.
[root@nbu1 staging]# nbdevquery -liststs
V7.5 nbu1 PureDisk 9
V7.5 nbu1 AdvancedDisk 5
The output shows two storage servers on this system, of type PureDisk and AdvancedDisk.
-listmediaid shows the disk volume information for the given media ID.
[root@nbu1 staging]# nbdevquery -listmediaid @aaaax
V7.5 dp_nbu1 PureDisk PureDiskVolume @aaaax 7.55 6.66 11 0 0 1 0 0 6
-listmounts shows the mount points of the disk pools.
[root@nbu1 staging]# nbdevquery -listmounts
Disk Pool dp_nbu1 has 1 Mount Points
PureDiskVolume @ nbu1 (mounted)
Disk Pool ad_nbu1 has 1 Mount Points
/ad @ nbu1 (mounted)
Each disk pool has one mount point.
-listglobals shows the SCSI Persistent Reservation setting.
[root@nbu1 staging]# nbdevquery -listglobals
SCSI Persistent Reservation: 0
-listconfig shows the storage server configuration.
[root@nbu1 staging]# nbdevquery -listconfig -stype PureDisk -storage_server nbu1
V7.5 "storagepath" "/dp" string
V7.5 "spalogpath" "/dp/log" string
V7.5 "dbpath" "/dp" string
V7.5 "required_interface" "nbu1" string
V7.5 "spalogretention" "7" int
V7.5 "verboselevel" "3" int
V7.5 "replication_target(s)" "none" string
V7.5 "Storage Pool Raw Size" "7.9GB" string
V7.5 "Storage Pool Reserved Space" "322.5MB" string
V7.5 "Storage Pool Size" "7.6GB" string
V7.5 "Storage Pool Used Space" "908.1MB" string
V7.5 "Storage Pool Available Space" "6.7GB" string
V7.5 "Catalog Logical Size" "110Bytes" string
V7.5 "Catalog files Count" "2" string
V7.5 "Space Used Within Containers" "156Bytes" string
V7.5 "Deduplication Ratio" "0.7" string
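The configuration listing can be turned into a lookup table with a few lines of Python. This is only a sketch: it assumes each line consists of a version tag, a key and a value in straight double quotes, and a type name, as in the listing; `parse_listconfig` is a hypothetical helper, not a NetBackup API.

```python
import re

# Assumed line layout: <version> "<key>" "<value>" <type>
LINE_RE = re.compile(r'^(\S+)\s+"([^"]*)"\s+"([^"]*)"\s+(\w+)\s*$')


def parse_listconfig(text):
    """Return {key: value}, converting values whose declared type is int."""
    config = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that does not fit the assumed layout
        _version, key, value, type_name = match.groups()
        config[key] = int(value) if type_name == "int" else value
    return config


# Three lines taken from the listing above.
SAMPLE = '''\
V7.5 "storagepath" "/dp" string
V7.5 "spalogretention" "7" int
V7.5 "Storage Pool Available Space" "6.7GB" string
'''

cfg = parse_listconfig(SAMPLE)
print(cfg["storagepath"], cfg["spalogretention"])
```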
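-listreptargets shows replication information for replication across backup domains.
nbdevquery -listreptargets -stunit <label> [-U]
# Replication is not configured in my environment, so there is no sample output yet. It will be added later.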
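NetBackup: a case of disk pool deletion failing
Symptom:
Deleting the PureDisk disk pool through the GUI reports the following error.
failed to delete disk pool, invalid command parameter
The debug log shows the following errors: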
22:33:17.555 [13642] <2> dsm_update_diskgroup_state: Calling dsm->updateDiskGroupState()
22:33:17.577 [13642] <16> dsm_update_diskgroup_state: DSM has encountered the following busy resource: dp_nbu1, mount point = PureDiskVolume
22:33:17.577 [13642] <16> dsm_update_diskgroup_state: ServiceException: method=updateDiskGroupState():7512 service=DiskService host=nbu1 errorDomain=DSM errorCode=2050027 errorText=dp_nbu1(PureDisk)@nbu1
22:33:17.577 [13642] <16> modify_disk_group_state: dsm_update_diskgroup_state call failed, bp_status = 20
22:33:17.577 [13642] <16> deletedg: failed to DOWN the disk pool (bp_status = 20), so can't delete it, returning
22:33:17.577 [13642] <2> nbdevconfig: operation returned status = 20
22:33:17.577 [13642] <16> DevConfigCLI::analyzeOp: failed to delete disk pool, invalid command parameter
22:33:17.579 [13642] <2> nbdevconfig: Exiting, status = 20
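When such a debug log is long, it can help to pull out only the severe entries. The sketch below assumes the line layout seen in the excerpt (time, process ID in square brackets, a numeric level in angle brackets, function name, message) and that a higher level means a more severe entry, with 16 marking the errors as in the excerpt. The helper names are hypothetical.

```python
import re

# Assumed layout: HH:MM:SS.mmm [pid] <level> function: message
LOG_RE = re.compile(r'^(\d{2}:\d{2}:\d{2}\.\d{3}) \[(\d+)\] <(\d+)> (\S+?): (.*)$')


def parse_line(line):
    """Parse one debug log line into a dict, or return None if it does not fit."""
    match = LOG_RE.match(line)
    if not match:
        return None
    time_stamp, pid, level, function, message = match.groups()
    return {
        "time": time_stamp,
        "pid": int(pid),
        "level": int(level),
        "function": function,
        "message": message,
    }


def severe_entries(lines, min_level=16):
    """Keep only the entries whose level is at least min_level."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["level"] >= min_level]


# Three lines copied from the excerpt above.
SAMPLE = """\
22:33:17.577 [13642] <16> dsm_update_diskgroup_state: DSM has encountered the following busy resource: dp_nbu1, mount point = PureDiskVolume
22:33:17.577 [13642] <2> nbdevconfig: operation returned status = 20
22:33:17.577 [13642] <16> DevConfigCLI::analyzeOp: failed to delete disk pool, invalid command parameter
"""

for entry in severe_entries(SAMPLE.splitlines()):
    print(entry["time"], entry["function"], entry["message"])
```

Here the first severe entry already names the cause: the disk pool is reported as a busy resource.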
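Troubleshooting steps:
1. Make sure the related storage units (STUs) have been deleted.
2. Make sure all images on this STU have expired.
For the second point, the following methods can be used:
a) Search the catalog and confirm that all images have expired.
b) nbstlutil list -U # make sure all images referenced by SLPs have expired.
3. Make sure no storage lifecycle policy (SLP) uses this disk pool.
On my system all of these checks came back clean, yet the disk pool still could not be deleted. Because no image information could be found in the catalog, I suspected that image cleanup had failed at some point and tried to delete the expired data manually.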
#nbdelete -allvolumes
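After running this command the deletion still did not seem to work. According to information found online, the -force option is needed.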
#nbdelete -allvolumes -force
The command completed. Deleting the disk pool from the command line then finished normally.
[root@nbu1 nbu1]# nbdevconfig -deletedp -dp dp_nbu1 -stype PureDisk
Disk pool dp_nbu1 has been deleted successfully
Copyright 快备份.
Please credit www.keifen.com when reposting.
STATUS CODE 5: Attempts to restore the SQL master database to a new server fail with a NetBackup Status Code 5 (the restore failed to recover the requested files).
Problem
Solution
Troubleshooting: Enable the dbclient log file on the SQL server.
Log files:
The dbclient log file shows the following error message:
16:23:14.443 [2832.5432] <16> CODBCaccess::LogODBCerr: DBMS MSG - ODBC return code <-1>, SQL State <37000>, SQL Message <3168><[Microsoft][ODBC SQL Server Driver][SQL Server]The backup of the system database on device VNBU0-2832-5432-1179865295 cannot be restored because it was created by a different version of the server (134218488) than this server (134219767).>.
Resolution:
As detailed in Microsoft Knowledge Base article 264474, it is not possible to restore a system database to a server whose build level differs from that of the original source server.
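The two numbers in the error message can be made readable with a small calculation. The sketch below assumes that the number packs the major version, minor version and build into one 32-bit integer (8 bits, 8 bits and 16 bits); the error text itself does not state this layout, so treat it as an interpretation.

```python
def decode_sql_version(value):
    """Decode a packed server version number into major.minor.build.

    Assumption: top 8 bits = major, next 8 bits = minor, low 16 bits = build.
    """
    major = (value >> 24) & 0xFF
    minor = (value >> 16) & 0xFF
    build = value & 0xFFFF
    return f"{major}.{minor:02d}.{build}"


# The two values quoted in the dbclient log message above.
backup_version = decode_sql_version(134218488)
server_version = decode_sql_version(134219767)
print("backup created by:", backup_version)
print("restore target   :", server_version)
```

Under that assumption the backup and the target server report different builds of the same major release, which is consistent with the restore being refused.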
A comprehensive list of solutions for the most common NetBackup for Microsoft SQL Server database agent backup and restore issues
Problem
A comprehensive list of solutions for the most common NetBackup for Microsoft SQL Server database agent backup and restore issues
Solution
- Performing a Database Move to the same SQLHOST or to an alternate HOST: http://symantec.com/docs/TECH35995
- Step-by-step configuration of NetBackup 6.0 for SQL 2005 full backups for single and multiple instances of SQL server: http://symantec.com/docs/TECH51052
- Step-by-step configuration of NetBackup 6.0 for SQL 2000/2005 Full backups Using the STRIPES parameter: http://symantec.com/docs/TECH51061
- If an SQL MOVE script is generated on the destination server, the GUI will incorrectly enter the SQLHOST variable: http://symantec.com/docs/TECH23330
- Step-by-step configuration of NetBackup 6.0 for SQL 2000/2005 differential backups: http://symantec.com/docs/TECH51055
- Step-by-step configuration of NetBackup 6.0 for SQL 2000/2005 transaction log backups: http://symantec.com/docs/TECH51059
- Step-by-step configuration of NetBackup 6.0 for SQL2000/2005 full backups using the BATCHSIZE parameter: http://symantec.com/docs/TECH51060
- How to perform a NetBackup for SQL Server alternate server restore operation of a Microsoft SQL Server database file/filegroups backup: http://symantec.com/docs/TECH35344
- How to manually create a SQL database restore script when having issues creating the restore script using the NetBackup MS SQL Client: http://symantec.com/docs/TECH34223
- How to restore to an alternate client a Microsoft SQL Server database that includes a differential image using the NetBackup MS SQL Client: http://symantec.com/docs/TECH34693
- How to restore a Microsoft SQL Server 7.0 or 2000 database to an alternate location with the NetBackup for Microsoft SQL Server database agent using a UNC path in the restore script: http://symantec.com/docs/TECH24927
- How to perform a redirected Microsoft SQL Server database restore: http://symantec.com/docs/TECH17567
- How to troubleshoot NetBackup for Microsoft SQL Server database restore issues: http://symantec.com/docs/TECH39006
2. How to perform a backup of SQL, SQL in a Cluster, SQL Transaction Logs, and perform a cold backup. Refer to the following TechNotes to resolve these issues:
- Configuration of full database backups with NetBackup 5.x for Microsoft SQL Server: http://symantec.com/docs/TECH35665
- How to configure backups and restores with NetBackup for Microsoft SQL Server database agent: http://symantec.com/docs/TECH39418
- Step-by-step configuration of NetBackup 6.x for SQL 2005 full backups for single and multiple instances of SQL server: http://symantec.com/docs/TECH51052
- How to create a single MS-SQL-Server NetBackup policy which contains multiple schedules with different retention levels: http://symantec.com/docs/TECH90544
- Step-by-step configuration of NetBackup 6.x for SQL 2000/2005 differential backups: http://symantec.com/docs/TECH51055
- Step-by-step configuration of NetBackup 6.x for SQL 2000/2005 transaction log backups: http://symantec.com/docs/TECH51059
- NetBackup 6.0 for Microsoft SQL Server System Administrator’s Guide: http://symantec.com/docs/TECH43949
- How to backup Microsoft SQL databases on a SQL server with multiple SQL instances: http://symantec.com/docs/TECH50665
3. Backup or restore failed with error 1, 2, 5, 236, 239, or 58; SQL DB in Loading state after restore due to typing mistakes in the backup or restore script. Refer to the following TechNotes for details and resolutions:
- SQL RESTORE fails with Error ‘RESTORE detected an error on page (10244:1799427905) in database “<DatabaseName>” as read from the backup set’: http://symantec.com/docs/TECH63983
- NetBackup for Microsoft SQL Server backups fail with Status Code 239 when attempting to back up a SQL Cluster: http://symantec.com/docs/TECH42954
- Unable to restore SQL database when the master server is on Unix or Linux: http://symantec.com/docs/TECH63773
- SQLHOST keyword in the script is pointing to the wrong host
- Wrong master name in the script
- BROWSE CLIENT = <virtual name> instead of the node name
- SQLHOST specified in capital letters
- Wrong DB name
- Wrong SQLINSTANCE keyword in the script
- BROWSE CLIENT in upper\lower case, must reflect the name in the Master if the Master is Unix
- Database in loading state after restore: RECOVEREDSTATE was set to NOTRECOVERED
- An ordinary restore script was used for an alternate client restore, the move script should be used instead
- BROWSECLIENT keyword is missing
- Incorrect DNS settings
- Incorrect reverse lookup
- Missing or incorrect IP address in the host file of the Client, Media or Master server.
Refer to the following TechNotes to address these issues:
- Getting error “Exclusive access could not be obtained because the database is in use” when attempting to restore database over different database using a move template: http://symantec.com/docs/TECH59128
- Attempts to restore a SQL database fail with a Status Code 5. The NetBackup MS SQL Client “View Status” window shows the following message: “Exclusive access could not be obtained because the database is in use”: http://symantec.com/docs/TECH44445
- SQL 2000 or SQL 2005 user database restore fails with the error “Exclusive access could not be obtained because the database is in use” when single user mode is already set on the database that is being restored: http://symantec.com/docs/TECH18466
- “Exclusive access could not be obtained because the database is in use” when performing a Microsoft SQL 2000 or SQL 2005 restore: http://symantec.com/docs/TECH16063
- NetBackup for Microsoft SQL Server database backup exits with Status Code 6, and a status 995 is reported in the SQL Server errorlog: http://symantec.com/docs/TECH5970
- After adding a new client to an MS-SQL-Server policy, the new client fails with a Status Code 2 and, in the dbclient log file, the message “The requested name is valid, but no data of the requested type was found” is shown: http://symantec.com/docs/TECH44647
- Wrong SA user account password specified in the SQL Agent properties
- No disk space
for Microsoft SQL Server: Only a full backup can be performed on the master database due to incorrect policies configuration
Refer to the following TechNotes for details and resolutions to these errors:
- NetBackup for Microsoft SQL Server database agent backup failed with Status Code 199: http://symantec.com/docs/TECH66372
- NetBackup for SQL Server transaction log backup exits with Status Code 1: http://symantec.com/docs/TECH57408
- Clarification to “Application Backup schedule” in the NetBackup for Microsoft SQL Server System Administrator’s Guide: http://symantec.com/docs/TECH17460
- Additional information about frequency and calendar based scheduling: http://symantec.com/docs/TECH37128
- How to back up multiple Microsoft SQL Server databases in parallel using more than one tape drive: http://symantec.com/docs/TECH18392
- Performance tuning for NetBackup for Microsoft SQL Server backups: http://symantec.com/docs/TECH33423
Refer to the following TechNotes for details and resolutions:
Most frequent causes:
Changed NetBackup (NBU) client service account to a working one, or one with the correct SQL rights
Selected “allow client browse” via Host Properties\Master
Added media server host name to client servers list
Corrected wrong master name in the client Registry
Corrected wrong client name in BAR GUI
8. Backup or restore error 41 or restore failed with error 5 due to needed tuning
Refer to the following TechNotes for details and resolutions:
- With some SQL issues, increasing the Client Read Timeout on the SQL client up to 36000 seconds will help.
- Performance tuning for NetBackup for Microsoft SQL Server backups: http://symantec.com/docs/TECH33423
- How to back up multiple Microsoft SQL Server databases in parallel using more than one tape drive: http://symantec.com/docs/TECH18392
- Restores of large Microsoft SQL server databases using the NetBackup for Microsoft SQL Server database extension fail before jobs start reading data from tape: http://symantec.com/docs/TECH14997
- How to troubleshoot Microsoft SQL Server database restore issues: http://symantec.com/docs/TECH39006
- Is it possible for SQL databases backed up with more than one stripe to be restored using fewer stripes when using the NetBackup for Microsoft SQL Server database agent? http://symantec.com/docs/TECH48409
- Changes to the NetBackup for SQL Microsoft SQL Server database agent allow a multi-striped image to be restored with a single stripe: http://symantec.com/docs/TECH49125
Refer to the following TechNotes for details and resolutions:
- NetBackup 7.1 for Microsoft SQL Server Administrator’s Guide: http://symantec.com/docs/DOC3670
- NetBackup 7.0 for Microsoft SQL Server Administrator’s Guide: http://symantec.com/docs/TECH127055
- NetBackup 6.5 for Microsoft SQL Server Administrator’s Guide: http://symantec.com/docs/TECH52812
- NetBackup 6.0 for Microsoft SQL Server Administrator’s Guide: http://symantec.com/docs/TECH43949
Article URL http://www.symantec.com/docs/TECH74475
Considerations when replacing libobk/orasbt when updating NetBackup for Oracle
Problem
Special coordination may be required to ensure that the NetBackup Client and NetBackup for Oracle libraries are properly updated when upgrading or applying a hotfix.
Solution
Oracle RMAN uses the System Backup to Tape (SBT) API to perform backups to tape devices. The NetBackup Oracle extension is an implementation of the SBT API.
Upgrading the SBT API can present some challenges for an application that runs 24 x 7. The information below should be reviewed and well understood before planning the installation or upgrade of NetBackup on an Oracle host.
The nature of running processes is that, by default, external references are resolved and the relevant shared object libraries read from disk and mapped into the running process space only once during the life of a process. Thus a process that runs continuously and performs a backup every day typically does not reload libraries before each backup. Consequently, the only way to force the process to load an updated copy of a library is by stopping and restarting the process. Hence the challenge to a 24 x 7 application.
Recommendations:
Follow these steps to perform a successful upgrade of NetBackup on an Oracle client host. This applies to upgrading the NetBackup Oracle extension and the NetBackup Client whose libraries are used by the extension. Prior to NetBackup 7.0, these are separately installed components and both should always be upgraded at the same time and to the same maintenance pack or release update level. Starting with NetBackup 7.0, the NetBackup Client install automatically includes the NetBackup for Oracle extension.
Please note that all references to 'sbt operations' encompass backup, restore, and catalog maintenance operations.
1) Stop all processes for the Oracle instances on the host. Some may have the old libraries mapped into process space. If there is more than one instance and all are using NetBackup, then all should be stopped.
2) Stop the Oracle listener process if sbt operations have been performed using TNS aliases since NetBackup was last installed or upgraded. In that configuration, the Oracle listener spawns the process that will do the sbt operation and it too will likely have the old libraries mapped into process space.
3) On HP-UX, the files on disk are the backing store for the running process and may be locked, causing any attempt to overwrite the files to fail. Check if the files are in use and terminate any processes that are using them prior to updating the libraries.
$ fuser /usr/openv/lib/libxbsa*
$ fuser /usr/openv/netbackup/bin/libobk*
4) On AIX, the old library may already be in the library cache. New or existing processes will look in the library cache first and may not load the new libraries from disk when resolving external references. If all the Oracle processes noted above have been halted, clear the cache.
$ /usr/sbin/slibclean
5) On Windows, locate all '*xbsa*.dll' and 'orasbt.dll' files and delete them. The install will place new copies in the appropriate locations, and older copies will no longer be inadvertently found higher in the search PATH when resolving external references.
6) Perform the install or upgrade per the NetBackup software distribution instructions.
7) After the install, inspect the output from the following commands to confirm that the expected versions of the files are installed.
$ cd /usr/openv
$ cat netbackup/bin/version
$ cat share/*oebu*
$ ls -1 lib/libxbsa* netbackup/bin/libobk* \
| while read -r fn ; do
netbackup/bin/goodies/support/versioninfo -f "$fn"
done
Note that the versioninfo program has been included in the NetBackup server distribution since NetBackup 6.0, but was not added to the client distribution until the 6.5.4 release update. It can be copied from a server of the same platform type as the client.
On Windows, locate the files and check their properties.
8) Following the install, ensure that Oracle is properly using the newly installed libraries by following the steps in TECH72307 in the Related Articles section.
Final Notes:
A) Consistently use SBT_LIBRARY for all SBT operations. This will cause an explicit dlopen system call to locate and read the library file when the channel is allocated. Then when the channel is released, an explicit dlclose system call will unload the library from the process space so that it can be reloaded from disk, when the channel is allocated for the next backup or restore. I.e.
ALLOCATE CHANNEL … TYPE SBT_TAPE PARMS='SBT_LIBRARY=/usr/openv/netbackup/bin/<appropriate_libobk>';
On AIX, be aware that the old library will still be referenced by the library cache. But if all sbt operations specified SBT_LIBRARY and are complete, the use count will be 0 so slibclean will remove it from the cache.
B) Avoid using a TNS alias to connect to the target database when the database is local to the host that is running RMAN. Using an alias causes the Oracle listener to create the Oracle server process. The listener may be running as a different user than the instance to backup or restore, which may have a different $ORACLE_HOME, which will cause a different path to be searched for libobk, which may cause an unexpected libobk to be loaded and used. See the Related Articles for details regarding Oracle 11g.
How to confirm that Oracle is loading the correct NBU Oracle extension library files for use
Problem
How to confirm that Oracle is loading the correct NBU Oracle extension library files for use?
Solution
Below is a process for confirming if the correct library files are installed and being utilized. These examples are for Unix, but the process is the same for Windows and the details are at the bottom of this document.
1) Shut down the Oracle instance.
2) If TNS aliases have been or will be used by RMAN to connect to the instance(s), then also shut down the listener.
3) Confirm all Oracle processes are down.
$ ps -ef | grep -i ora
4) On AIX, also clear the library cache. This will only work if all Oracle processes that utilize the libobk are down and the library use counter has decremented to 0.
$ /usr/sbin/slibclean
Note: Confirming that the Oracle processes are down is significant! Once the Oracle instance or listener starts and attempts an SBT API operation, it loads the then current library files from disk into memory. The running process should unload the library when not needed and then load a new copy when needed, but in rare instances may not. If that happens, the Oracle instance may remain ignorant of updated copies on disk and continue to use the older copy already loaded into process space. See TECH72419 in the Related Articles section for additional details.
5) Capture the last access times on the library files defined by Oracle, NBU Oracle, and NBU Client.
(
ls -lu $ORACLE_HOME/lib*/libobk*
ls -lu /usr/openv/netbackup/bin/libobk*
ls -lu /usr/openv/lib/libxbsa*
) > /tmp/nbu-lib-access.before.out
6) Restart the Oracle instance.
7) Perform a backup, restore or catalog maintenance operation using RMAN.
8) Capture the last access times on the library files again.
(
ls -lu $ORACLE_HOME/lib*/libobk*
ls -lu /usr/openv/netbackup/bin/libobk*
ls -lu /usr/openv/lib/libxbsa*
) > /tmp/nbu-lib-access.after.out
9) Compare the output files from steps 5 and 8.
The access time on one of the NBU libobk.* files and one of the NBU libxbsa files should have been updated. The access time on the libobk.* in one of the lib, lib32, or lib64 subdirectories below $ORACLE_HOME may also have been updated.
If the access times did not update on the expected library files then either the Oracle instance configuration or the RMAN PARMS statement in the backup/restore/maintenance script specifies an alternate location for SBT_LIBRARY or another libobk* file exists higher on the LD_LIBRARY_PATH (Solaris & Linux), SHLIB_PATH (HP-UX), or LIBPATH (AIX). The DBA will be familiar with the Oracle library load search process and can make the adjustments so that it uses the correct file.
If the access times updated on unexpected files, then the DBA will also need to correct the Oracle library load search process. This may involve specifying SBT_LIBRARY, deleting older libraries that are higher on the search path, or symbolically linking the instance to the appropriate NBU libobk. To build the correct symbolic links, use this script.
$ /usr/openv/netbackup/bin/oracle_link
If the access times did not update on any of the libobk files in the Oracle or NBU directories, then it is likely that RMAN is connecting to an Oracle instance running on another host instead of on this host. The DBA should check if the target instance is being accessed via a TNS alias and if the alias is resolving correctly.
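The before/after comparison in steps 5, 8 and 9 can also be scripted. The Python sketch below records the access time of every file matching the three patterns used above and reports which files were touched in between. The function names are hypothetical. Note that `os.stat` follows symbolic links, unlike `ls -lu` on the link itself, and that file systems mounted with options such as noatime do not update access times reliably.

```python
import glob
import os

# The same three locations that the ls -lu commands above inspect.
PATTERNS = [
    "$ORACLE_HOME/lib*/libobk*",
    "/usr/openv/netbackup/bin/libobk*",
    "/usr/openv/lib/libxbsa*",
]


def snapshot_atimes(patterns):
    """Return {path: last access time} for every file matching the patterns."""
    times = {}
    for pattern in patterns:
        # expandvars resolves $ORACLE_HOME from the current environment.
        for path in glob.glob(os.path.expandvars(pattern)):
            times[path] = os.stat(path).st_atime
    return times


def changed_files(before, after):
    """List the paths whose access time differs between two snapshots."""
    return sorted(p for p, t in after.items() if before.get(p) != t)


# Typical use, mirroring the manual procedure:
#   before = snapshot_atimes(PATTERNS)   # step 5
#   ... restart the instance and run an RMAN operation (steps 6 and 7) ...
#   after = snapshot_atimes(PATTERNS)    # step 8
#   print(changed_files(before, after))  # step 9
```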
11) If the correct files are being accessed, but the libraries still will not load, then check the NetBackup Database Compatibility matrix to ensure that the platform, architecture (32 or 64 bit), and version of Oracle that is in use is also supported by the version of NetBackup that is installed.
$ egrep 'NetBackup XBSA Interface|NetBackup for Oracle' log.??????
08:30:33.508 [11019] <4> VxBSAInit: Veritas NetBackup XBSA Interface - 7.1 2011020313
Veritas NetBackup for Oracle - Release 7.1 (2011020313)
08:34:13.996 [12828] <4> VxBSAInit: Veritas NetBackup XBSA Interface - 7.1 2011020313
Veritas NetBackup for Oracle - Release 7.1 (2011020313)
09:04:48.405 [26337] <4> VxBSAInit: Veritas NetBackup XBSA Interface - 7.1 2011020313
Veritas NetBackup for Oracle - Release 7.1 (2011020313)
For comparison, these are the build dates for libxbsa and libobk for the NetBackup 6.5 and 7.x releases.
Veritas NetBackup for Oracle - Release 6.5 (2007111606)
Veritas NetBackup for Oracle - Release 6.5 (2008052301)
(No NB Oracle fixes in 6.5.3, should be using NB_ORA_6.5.2.)
Veritas NetBackup for Oracle - Release 6.5 (2009050106)
(No NB Oracle fixes in 6.5.5, should be using NB_ORA_6.5.4.)
Veritas NetBackup for Oracle - Release 6.5 (2010042405)
Veritas NetBackup for Oracle - Release 7.0 (2010010418)
Veritas NetBackup for Oracle - Release 7.0 (2010070723)
Veritas NetBackup for Oracle - Release 7.1 (2011020313)
Veritas NetBackup for Oracle - Release 7.1 (2011061213)
Veritas NetBackup for Oracle - Release 7.1 (2011082510)
If the build dates are still not correct, then enable the RMAN Channel Trace and inspect the resulting trace file in the user dump destination to see where the library is being loaded from. E.g.
$ cat udump/oracle9_ora_21811.trc
…snip…
try loading : libobk.so
Loaded (/app/oracle/lib/libobk.so)
- The library files of interest are *xbsa*.dll and orasbt.dll.
- Symbolic links do not exist so there may be multiple copies of either file on the host. Find and remove all copies of the library files and then reinstall NetBackup so there is only one copy. The reinstall will place the libraries in the correct location so there isn’t any need for an oracle_link script on Windows.
- The library file access times can be viewed in the File/Computer Explorer, but the ‘Date Accessed’ column must be added to the display.
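NetBackup NAS NDMP backup process
- nbpem schedules the backup job.
- EMM selects an idle tape drive, allocates a tape, and sends the tape load request to ltid.
- The ltid service on the server sends the corresponding SCSI commands to the NDMP client. If the robot is managed by a media server, the commands are sent to the server that manages the robot.
- Once the tape is loaded, NetBackup sends the backup commands. Data can be written to a local tape drive, to a tape drive on another NAS device, or to a media server storage unit.
- The NAS device continuously sends backup information to the master server over the NDMP protocol, where it is recorded in the catalog.
- When the backup completes, the NAS device reports the backup status.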
NetBackup Media Server Deduplication (MSDP) configuration via Storage Server Configuration Wizard fails RDSM has encountered an STS error
Problem
NetBackup Media Server Deduplication (MSDP) configuration via ‘Storage Server Configuration Wizard’ fails ‘RDSM has encountered an STS error: failed to update the storage server configuration due to unsupported platform, invalid configuration or system error’, and spad won’t start.
Error
Environment
- NetBackup 7.x Media Server Deduplication (MSDP) media server
- the operating system (OS) and NetBackup application were lost due to root disk corruption and/or server hardware failure
- Deduplication storage and database reside on disk storage that is unaffected by the loss of the operating system and NetBackup application.
Cause
po_list_2 file is resident in the <dedup_db_path>\databases\catalog\po_list_2 (or <dedup_db_path>/databases/catalog/po_list_2 on UNIX/Linux) directory preventing spad from starting.
Solution
On the affected MSDP media server, confirm the issue is caused by a resident po_list_2 file:
1) Open a command prompt to the pdde directory (typically <install_path>\Veritas\pdde\ on Windows or /usr/openv/pdde on UNIX/Linux) and run:
spad --trace -v
Warning: 22: Version mismatch: spad is version 6.6.0.45901 while libdct is version 6.6.0.35883
Info: set thread[ [0000000002969130]: ] max log size to 0
Info: set entire process max log size to 0
Info: recover po list [F:\msdp_db\databases\catalog\po_list_2]
Trace: _crConfigLoadA: loading configuration from F:\msdp\etc\puredisk\spa.cfg
Trace: _crSSLConfigLoadA: loading configuration from F:\msdp\etc\puredisk\spa.cfg
Trace: F:\msdp\etc\puredisk\spa.cfg: SSL:CAFilename configuration entry not found, skipped
Trace: WSRequestExt: submitting &request=9&login=agent_1627000000&passwd=0800624850a662c56181f6b47f00ea56&action=getRoutingTables
Trace: <NULL>: SSL:PrivateKeyFilename configuration entry not found, skipped
Trace: <NULL>: SSL:CertFilename configuration entry not found, skipped
Trace: <NULL>: SSL:CAFilename configuration entry not found, skipped
Trace: Set NetLookupHost timeout to 0
Trace: Connecting to SPA route 0 f 127.0.0.1:10102
Trace: routeAlloc: parsing route: 0 f 127.0.0.1:10102
Trace: Copied 127.0.0.1 to 127.0.0.1
Trace: _crRouteAdd: obtaining list of local IPv4 addresses
Trace: _crRouteCheckLocal: Checking if gateway 127.0.0.1 matches a local IP address
Trace: _crRouteCheckLocal: checking 127.0.0.1 -> 127.0.0.1
Trace: Loading route for local IP address 127.0.0.1 from line 0 of
Trace: CRControlConnect: connecting to 127.0.0.1
Trace: _crSessionCheck: called
Trace: _crSessionCheck: setting up session to 127.0.0.1
Trace: _crGwConnect: setting up connection to 127.0.0.1:10102
Trace: TCP: TCP_NODELAY set
Trace: TCP: SO_KEEPALIVE set
Trace: TCP: using a 8192 byte receive buffer.
Trace: TCP: using a 8192 byte send buffer.
Trace: _crGwConnect: connect to: 127.0.0.1:10102
Trace: CRMapError: called by _crGwConnect (e:\pbe\darrieus-pdde-7.0.1-cft\x86_64\darrieus-pdde-7.0.1-cft\src\svn\libs\clibs\libcr\route.c:172), errno = 10061
Error: 25053: Could not establish a connection to 127.0.0.1:10102: connect failed (No connection could be made because the target machine actively refused it. )
Error: 25053: Connection failed connection actively refused
Trace: CRShutdown: start
Trace: _crRouteDelete: deleting route to 127.0.0.1
Trace: _crGwClose: closing gateway 127.0.0.1
Trace: _crGwClose: gateway 127.0.0.1 closed: no error
Trace: _crRouteDelete: route to 127.0.0.1 deleted
Trace: CRShutdown: done
Error: 25053: Could not retrieve routing tables: webservice failed (connection actively refused)
Error: 25053: _getRoutingTable failed
Trace: CRShutdown: start
Trace: CRShutdown: done
Error: 25044: can't get deref cr ctx. [2]
Error: 25044: Could not delete do with poid
Error: 25044: can't delete cdo from CR [d8be9f63f8e55184405a2627f5316a74]
Error: 25044: could not delete file po F:\msdp_db\databases\catalog\2\netbackup-host1\FilesvrBackup\netbackup-host1
300265421_C1_F1.info
Error: 25044: can't update entry [2|/svr-netbackup-1/Online-Catalog-Backup|svr-netbackup-1_1300265421_C1_F1.info|0|||||3|D|0|0100600|0||260|1300265444|1300265444|1300265444|246||||0||PDVFS_2_F_0_ID_6_RT_0|0
]
Error: 25044: can't recover [F:\msdp_db\databases\catalog\po_list_2]. (not initialized)
Error: 25044: can't recover [po_list]. Please check the log.
Error: 26016: Catalog: Recover failure.
Move the po_list_2 file to an alternate directory.
NetBackup STS error 2060012: call should be repeated
Problem
Job details
10/10/2012 7:57:55 PM - Critical bptm(pid=1920) sts_close_handle failed: 2060012 call should be repeated
10/10/2012 7:57:55 PM - Critical bptm(pid=1920) image close failed: error 2060012: call should be repeated
10/10/2012 7:57:55 PM - Info bptm(pid=1920) EXITING with status 84 <----------
bptm logs:
16:22:57.462 [32054] <2> 6638534:bptm:32054:xxx10: [DEBUG] PDVFS: pdvfs_lib_log: Received binary message from xxx10.host.name.com:10102: REPLY 2425346386 11 1: 4
16:22:57.462 [32054] <2> 6638534:bptm:32054:xxx10: [DEBUG] PDVFS: pdvfs_lib_log: _crDataGetSimple: 128 bytes
16:22:57.462 [32054] <2> 6638534:bptm:32054:xxx10: [DEBUG] PDVFS: pdvfs_lib_log: readn: socket recv want 128 bytes, recv 128 bytes
16:22:57.462 [32054] <16> 6638534:bptm:32054:xxx10: [ERROR] PDVFS: pdvfs_lib_log: WSRequestPOST failed: unknown error (4)
16:22:57.462 [32054] <16> 6638534:bptm:32054:xxx10: [ERROR] PDVFS: _pdvfs_cas_send_po_list: MBPOAddList failed: unknown error (4)
16:22:57.462 [32054] <16> 6638534:bptm:32054:xxx10: [ERROR] PDVFS: pdvfs_cas_sync_po_list: Error _pdvfs_cas_send_po_list failed: unknown error (4)
16:22:57.462 [32054] <16> 6638534:bptm:32054:xxx10: [ERROR] PDVFS: pdvfs_set_mb_import: pdvfs_cas_sync_po_list failed: Input/output error (5)
16:22:57.462 [32054] <4> 6638534:bptm:32054:xxx10: [INFO] PDVFS: PdvfsRead: return fd=4 res=6
16:22:57.462 [32054] <16> 6638534:bptm:32054:xxx10: [ERROR] PDSTS: sync_pdvfs: sync using </ebs10#1/.sync> failed, expected <COMPLETED>, found <FAILED> (2060012:call should be repeated)
We can see by the log entry:
PDVFS: pdvfs_lib_log: Received binary message from xxx10.host.name.com:10102: REPLY 2425346386 11 1: 4
PDVFS: pdvfs_lib_log: readn: socket recv want 128 bytes, recv 128 bytes
A connection to spad (port 10102) was established and a message was received.
The next step is to examine spad.log.
Error
EXIT STATUS 84
and
error 2060012: call should be repeated
Environment
Cause
spad.log at startup shows the following:
October 11 18:38:36 INFO: 25002: cannot open file: D:\MSDP\databases\catalog\2\Ho-xxxx-customer-hostname-removed_1348848026_C1_HDR.info[R_1] (no such object)
: :
October 11 18:38:36 ERR: 25002: can't recover [D:\MSDP\databases\catalog\po_list_2]. (no such object)
October 11 18:38:36 ERR: 25002: can't recover [po_list]. Please check the log.
October 11 18:38:36 WARNING: 25000: Recovery of catalog failed during startup, will recover again in run time!
October 11 18:38:36 INFO: CR mode in spa db is normal !
Solution
Shut down NetBackup, move D:\MSDP\databases\catalog\po_list_2 to another location, and restart NetBackup.
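The move itself can be scripted so that the file is kept rather than deleted. This is a minimal sketch, assuming NetBackup has already been shut down as described; `move_po_list_aside` and the holding directory are hypothetical, and the catalog path must be adjusted to the local deduplication database location.

```python
import shutil
import time
from pathlib import Path


def move_po_list_aside(catalog_dir, holding_dir, name="po_list_2"):
    """Move the po_list file out of the catalog directory.

    Returns the new path, or None if the file is not present.
    Assumption: NetBackup and the deduplication services are stopped.
    """
    source = Path(catalog_dir) / name
    if not source.is_file():
        return None
    holding = Path(holding_dir)
    holding.mkdir(parents=True, exist_ok=True)
    # A timestamp suffix avoids overwriting a copy moved aside earlier.
    target = holding / f"{name}.{time.strftime('%Y%m%d-%H%M%S')}"
    shutil.move(str(source), str(target))
    return target


# Example with the catalog path from this case and a hypothetical holding directory:
#   move_po_list_aside(r"D:\MSDP\databases\catalog", r"D:\MSDP_hold")
```

Keeping the file in a holding directory makes it possible to hand it to support later if the catalog recovery needs further analysis.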