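NetBackup vmd process fails to start: a case study

Platform:
Windows 2003, NetBackup 6.5.6 media server
Symptom:
After the system booted, a check of the NetBackup processes showed that vmd had not started. The inetd service also failed to start.
Troubleshooting steps:
1. Restarted NetBackup; the problem persisted.
2. Tried to start the inetd service manually; the system reported a user name (logon) error.
3. Asked the administrator about recent changes and learned that the superuser password had been changed recently.
4. Checked the inetd service: it is configured to start under the superuser account. Updated the password stored in the service's logon settings.
5. Restarted the NetBackup processes; vmd started normally.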

NetBackup nbdevquery command explained

Purpose of nbdevquery:
This command checks the state of NetBackup disk-based storage, for example the disk pools and storage servers of OpenStorage, PureDisk, and AdvancedDisk.

Command options:
[-listdp]
[-listdv]
[-liststs]
[-listmediaid]
[-listmounts]
[-listglobals]
[-listconfig]
[-listreptargets]

Option details:
-listdp lists all disk pools on the system.

[root@nbu1 staging]# nbdevquery -listdp
V7.5 dp_nbu1 1 7.55 7.55 1 98 80 -1 nbu1
V7.5 ad_nbu1 1 3.94 3.94 1 98 80 -1 nbu1
This shows two disk pools on a NetBackup 7.5 system. Add the -U option for detailed output.

-listdv shows disk pool (disk volume) status, mainly whether it is online (UP) or offline (DOWN).

[root@nbu1 staging]# nbdevquery -listdv -stype PureDisk -U
Disk Pool Name : dp_nbu1
Disk Type : PureDisk
Disk Volume Name : PureDiskVolume
Disk Media ID : @aaaax
Total Capacity (GB) : 7.55
Free Space (GB) : 6.66
Use% : 11
Status : DOWN
Flag : ReadOnWrite
Flag : AdminUp
Flag : InternalDown
Num Read Mounts : 0
Num Write Mounts : 1
Cur Read Streams : 0
Cur Write Streams : 0
Num Repl Sources : 0
Num Repl Targets : 0
The output shows that this disk pool's volume is currently DOWN.
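With several pools configured, the -U listing gets long. As a minimal sketch, the DOWN volumes can be filtered out with awk. The helper name down_volumes is made up for this example, and the field labels are assumed to match the sample output above.

```sh
#!/bin/sh
# Print "pool/volume" for every disk volume reported as DOWN.
# Reads `nbdevquery -listdv ... -U` style output on stdin.
# down_volumes is a hypothetical helper, not part of NetBackup.
down_volumes() {
    awk -F' *: *' '
        { gsub(/^[ \t]+|[ \t\r]+$/, "") }          # trim the line
        $1 == "Disk Pool Name"   { pool = $2 }
        $1 == "Disk Volume Name" { vol  = $2 }
        $1 == "Status" && $2 == "DOWN" { print pool "/" vol }
    '
}

# Abridged copy of the sample output shown above.
sample='Disk Pool Name : dp_nbu1
Disk Type : PureDisk
Disk Volume Name : PureDiskVolume
Status : DOWN
Flag : AdminUp
Flag : InternalDown'

printf '%s\n' "$sample" | down_volumes
```

On a live system the function would be fed from the real command, for example `nbdevquery -listdv -stype PureDisk -U | down_volumes`.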

-liststs lists all storage servers.

[root@nbu1 staging]# nbdevquery -liststs
V7.5 nbu1 PureDisk 9
V7.5 nbu1 AdvancedDisk 5
The system has two storage servers, of types PureDisk and AdvancedDisk.
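For scripting, the server name and type can be pulled out of the terse listing. This is only a sketch: it assumes, based on the sample above, that the second field is the storage server name and the third is its type. The helper name is made up.

```sh
#!/bin/sh
# Print "<storage server> <type>" for each line of `nbdevquery -liststs` output.
# list_storage_servers is a hypothetical helper; the field positions are
# inferred from the sample output, not from vendor documentation.
list_storage_servers() {
    awk 'NF >= 3 && $1 ~ /^V[0-9]/ { print $2, $3 }'
}

sample='V7.5 nbu1 PureDisk 9
V7.5 nbu1 AdvancedDisk 5'

printf '%s\n' "$sample" | list_storage_servers
```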

-listmediaid shows the disk volume information for a given media ID.

[root@nbu1 staging]# nbdevquery -listmediaid @aaaax
V7.5 dp_nbu1 PureDisk PureDiskVolume @aaaax 7.55 6.66 11 0 0 1 0 0 6

-listmounts shows the mount points of the disk pools.

[root@nbu1 staging]# nbdevquery -listmounts
Disk Pool dp_nbu1 has 1 Mount Points
PureDiskVolume @ nbu1 (mounted)
Disk Pool ad_nbu1 has 1 Mount Points
/ad @ nbu1 (mounted)
Each disk pool has one mount point.

-listglobals shows the SCSI Persistent Reservation setting.

[root@nbu1 staging]# nbdevquery -listglobals
SCSI Persistent Reservation: 0

-listconfig shows the storage server configuration.

[root@nbu1 staging]# nbdevquery -listconfig -stype PureDisk -storage_server nbu1
V7.5 "storagepath" "/dp" string
V7.5 "spalogpath" "/dp/log" string
V7.5 "dbpath" "/dp" string
V7.5 "required_interface" "nbu1" string
V7.5 "spalogretention" "7" int
V7.5 "verboselevel" "3" int
V7.5 "replication_target(s)" "none" string
V7.5 "Storage Pool Raw Size" "7.9GB" string
V7.5 "Storage Pool Reserved Space" "322.5MB" string
V7.5 "Storage Pool Size" "7.6GB" string
V7.5 "Storage Pool Used Space" "908.1MB" string
V7.5 "Storage Pool Available Space" "6.7GB" string
V7.5 "Catalog Logical Size" "110Bytes" string
V7.5 "Catalog files Count" "2" string
V7.5 "Space Used Within Containers" "156Bytes" string
V7.5 "Deduplication Ratio" "0.7" string
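A single setting can be read from this listing by splitting each line on the double quotes. The sketch below assumes the command prints plain ASCII double quotes around the name and the value; config_value is a made-up helper name.

```sh
#!/bin/sh
# Print the value of one setting from `nbdevquery -listconfig` output (stdin).
# $1 is the setting name. Assumes lines of the form:
#   V7.5 "name" "value" type
# config_value is a hypothetical helper, not a NetBackup command.
config_value() {
    awk -F'"' -v key="$1" '$2 == key { print $4 }'
}

sample_config() {
cat <<'EOF'
V7.5 "storagepath" "/dp" string
V7.5 "Storage Pool Size" "7.6GB" string
V7.5 "Storage Pool Available Space" "6.7GB" string
EOF
}

sample_config | config_value "Storage Pool Available Space"
```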

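-listreptargets shows the replication targets used for replication across backup domains.
nbdevquery -listreptargets -stunit <label> [-U]
# My environment has no replication configured, so there is no sample output yet. It will be added later.
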
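NetBackup disk pool deletion fails: a case study

Symptom:
Deleting a PureDisk disk pool through the GUI fails with the following error:
failed to delete disk pool, invalid command parameter
The debug log shows the following errors: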

22:33:17.555 [13642] <2> dsm_update_diskgroup_state: Calling dsm->updateDiskGroupState()
22:33:17.577 [13642] <16> dsm_update_diskgroup_state: DSM has encountered the following busy resource: dp_nbu1, mount point = PureDiskVolume
22:33:17.577 [13642] <16> dsm_update_diskgroup_state: ServiceException: method=updateDiskGroupState():7512 service=DiskService host=nbu1 errorDomain=DSM errorCode=2050027 errorText=dp_nbu1(PureDisk)@nbu1
22:33:17.577 [13642] <16> modify_disk_group_state: dsm_update_diskgroup_state call failed, bp_status = 20
22:33:17.577 [13642] <16> deletedg: failed to DOWN the disk pool (bp_status = 20), so can't delete it, returning
22:33:17.577 [13642] <2> nbdevconfig: operation returned status = 20
22:33:17.577 [13642] <16> DevConfigCLI::analyzeOp: failed to delete disk pool, invalid command parameter
22:33:17.579 [13642] <2> nbdevconfig: Exiting, status = 20
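In a busy log the relevant entries are easier to spot once the lower-severity lines are dropped. The sketch below keeps only the entries tagged <16>, which in this log mark the errors, and strips the timestamp and PID prefix. The helper name error_lines is made up.

```sh
#!/bin/sh
# Keep only entries tagged <16> and strip the "time [pid] <16>" prefix.
# error_lines is a hypothetical helper name.
error_lines() {
    grep -F '<16>' | sed 's/^[^>]*> *//'
}

# Three lines copied from the debug log above.
sample_log() {
cat <<'EOF'
22:33:17.555 [13642] <2> dsm_update_diskgroup_state: Calling dsm->updateDiskGroupState()
22:33:17.577 [13642] <16> dsm_update_diskgroup_state: DSM has encountered the following busy resource: dp_nbu1, mount point = PureDiskVolume
22:33:17.577 [13642] <16> modify_disk_group_state: dsm_update_diskgroup_state call failed, bp_status = 20
EOF
}

sample_log | error_lines
```

Here the first surviving line already names the cause: the disk pool's mount point is still busy.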

 

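Troubleshooting steps:
1. Make sure the related storage units (STUs) have been deleted.
2. Make sure all images on the STU have expired.
For the second point, the following methods can be used:
a) Search the catalog and confirm that every image has expired.
b) nbstlutil list -U  # confirm that every image referenced by an SLP has expired.
3. Make sure no storage lifecycle policy (SLP) uses the disk pool.

On my system all of these checks came back clean, yet the disk pool still could not be deleted. Because the catalog no longer listed any images, I suspected that image cleanup had gone wrong, and tried deleting the expired data manually.
# nbdelete -allvolumes
That alone did not seem to help. Information found online suggested adding the -force option.
# nbdelete -allvolumes -force
Once the command completed, deleting the disk pool from the command line succeeded.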

[root@nbu1 nbu1]# nbdevconfig -deletedp -dp dp_nbu1 -stype PureDisk
Disk pool dp_nbu1 has been deleted successfully
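The two commands that finally worked can be wrapped in a small script. This is a sketch under stated assumptions: the run and delete_disk_pool helpers are made up, only the flags shown above are used, and by default nothing is executed (DRY_RUN=1 just prints the commands).

```sh
#!/bin/sh
# run: echo the command when DRY_RUN is 1 (the default here), execute it otherwise.
# Both helpers are hypothetical wrappers around the commands used above.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# delete_disk_pool <pool name> <storage type>
# Forces cleanup of expired fragments, then deletes the pool.
delete_disk_pool() {
    pool=$1
    stype=$2
    run nbdelete -allvolumes -force &&
    run nbdevconfig -deletedp -dp "$pool" -stype "$stype"
}

delete_disk_pool dp_nbu1 PureDisk
```

Setting DRY_RUN=0 would execute the commands for real, so the storage unit, image, and SLP checks listed earlier should be done first.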

Copyright 快备份 (keifen.com).
When reposting, please credit www.keifen.com

STATUS CODE 5: Attempts to restore the SQL master database to a new server fail with a NetBackup Status Code 5 (the restore failed to recover the requested files).

Problem

STATUS CODE 5: Attempts to restore the SQL master database to a new server fail with a NetBackup Status Code 5 (the restore failed to recover the requested files).

Solution

Overview:  Attempts to restore the SQL master database to a new server fail with a NetBackup Status Code 5 (the restore failed to recover the requested files).

Troubleshooting: Enable the dbclient log file on the SQL server.

Log files:
The dbclient log file shows the following error message:
16:23:14.443 [2832.5432] <16> CODBCaccess::LogODBCerr: DBMS MSG - ODBC return code <-1>, SQL State <37000>, SQL Message <3168><[Microsoft][ODBC SQL Server Driver][SQL Server]The backup of the system database on device VNBU0-2832-5432-1179865295 cannot be restored because it was created by a different version of the server (134218488) than this server (134219767).>.

Resolution:
As detailed on the Microsoft website in knowledge base article 264474 (link below), it is not possible to restore a system database to a server whose build level differs from that of the original source server.

http://support.microsoft.com/kb/264474
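The two numbers in the error message can be turned into readable build versions. The sketch below assumes the value packs the major version in the top byte, the minor version in the next byte, and the build number in the low 16 bits. That layout is an assumption, but it fits both numbers from the log: they decode to 8.0.760 and 8.0.2039, which correspond to the SQL Server 2000 SP3 and SP4 builds. The function name is made up.

```sh
#!/bin/sh
# Decode a packed SQL Server version number into major.minor.build.
# Assumed layout: major = bits 24-31, minor = bits 16-23, build = bits 0-15.
# decode_sql_version is a hypothetical helper name.
decode_sql_version() {
    v=$1
    echo "$(( v >> 24 )).$(( (v >> 16) & 255 )).$(( v & 65535 ))"
}

decode_sql_version 134218488   # build that created the backup
decode_sql_version 134219767   # build of the server being restored to
```

Bringing the target server to the same build as the source before restoring master is what the Microsoft article calls for.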

A comprehensive list of solutions for the most common NetBackup for Microsoft SQL Server database agent backup and restore issues

Problem

A comprehensive list of solutions for the most common NetBackup for Microsoft SQL Server database agent backup and restore issues

Solution

1. How to restore to an alternate client, to the same client, with a different DB name, with a move script, or to a cluster, and how to restore SQL transaction logs.
Refer to the following TechNotes to address these issues:

2. How to perform a backup of SQL, SQL in a Cluster, SQL Transaction Logs, and perform a cold backup.  
Refer to the following TechNotes to resolve these issues:

3. Backup or restore failed with error 1, 2, 5, 236, 239, or 58; SQL DB in Loading state after restore due to typing mistakes in the Backup or Restore Script.
Refer to the following TechNotes for details and resolutions:
Other possible causes of backup or restore failures are as follows:
  • SQLHOST keyword in the script is pointing to the wrong host
  • Wrong master name in the script
  • BROWSE CLIENT = <virtual name> instead of the node name
  • SQLHOST specified in capital letters
  • Wrong DB name
  • Wrong SQLINSTANCE keyword in the script
  • BROWSE CLIENT in the wrong upper/lower case; it must reflect the name in the Master if the Master is Unix
  • Database in loading state after restore: RECOVEREDSTATE was set to NOTRECOVERED
  • An ordinary restore script was used for an alternate client restore, the move script should be used instead
  • BROWSECLIENT keyword is missing
4. Backup or restore fails with error 2, 5, 23, 58, or 48, or the backup hangs, due to incorrect name resolution or network issues.  Refer to the following for troubleshooting and most frequent causes:

Explanation of bpclntcmd command options, the system calls being used, and recommended troubleshooting when the commands return errors:  http://symantec.com/docs/TECH50198
Status Code 23 during client backups or restores, or when loading client properties:  http://symantec.com/docs/TECH57100
Use the bpclntcmd to troubleshoot the following problems:
  • Incorrect DNS settings
  • Incorrect reverse lookup
  • Missing or incorrect IP address in the host file of the Client, Media or Master server.
5. Backup or restore fails with error 1, 2, 25; restore fails with error 5, or Error “Exclusive access could not be obtained because the database is in use” due to 3rd party application problems
Refer to the following TechNotes to address these issues:
  • Getting error “Exclusive access could not be obtained because the database is in use” when attempting to restore database over different database using a move template:  http://symantec.com/docs/TECH59128
  • Attempts to restore a SQL database fail with a Status Code 5.  The NetBackup MS SQL Client “View Status” window shows the following message: “Exclusive access could not be obtained because the database is in use”:  http://symantec.com/docs/TECH44445
  • SQL 2000 or SQL 2005 user database restore fails with the error “Exclusive access could not be obtained because the database is in use” when single user mode is already set on the database that is being restored:  http://symantec.com/docs/TECH18466
  • “Exclusive access could not be obtained because the database is in use” when performing a Microsoft SQL 2000 or SQL 2005 restore:  http://symantec.com/docs/TECH16063
  • NetBackup for Microsoft SQL Server database backup exits with Status Code 6, and a status 995 is reported in the SQL Server errorlog:  http://symantec.com/docs/TECH5970
  • After adding a new client to an MS-SQL-Server policy, the new client fails with a Status Code 2 and, in the dbclient log file, the message “The requested name is valid, but no data of the requested type was found” is shown:  http://symantec.com/docs/TECH44647
Other possible causes for failure:
  • Wrong SA user account password specified in the SQL Agent properties
  • No disk space
6. Backup error 1, 2, 240, 199, or Error 2: “USER – Operation inhibited by NetBackup for Microsoft SQL Server: Only a full backup can be performed on the master database”, due to incorrect policy configuration
Refer to the following TechNotes for details and resolutions to these errors:
7. Backup failed with error 2, 167, backup or restore error 25, and bplist command error 133 due to parameters that can be changed via a GUI
Refer to the following TechNotes for details and resolutions:

Most frequent causes, listed by the fix that resolved them:
  • Changed the NetBackup (NBU) client service account to a working one, or one with the correct SQL rights
  • Selected “allow client browse” via Host Properties\Master
  • Added the media server host name to the client's servers list
  • Corrected a wrong master name in the client Registry
  • Corrected a wrong client name in the BAR GUI

During an alternate client restore, the error “ERROR Initializing NetBackup Catalog” occurs, launching the SQL Backup History Options GUI:  http://symantec.com/docs/TECH27039

8. Backup or restore error 41 or restore failed with error 5 due to needed tuning

Refer to the following TechNotes for details and resolutions:
  • With some SQL issues, increasing the Client Read Timeout on the SQL client up to 36000 seconds will help.
  • Performance tuning for NetBackup for Microsoft SQL Server backups:  http://symantec.com/docs/TECH33423
  • How to back up multiple Microsoft SQL Server databases in parallel using more than one tape drive: http://symantec.com/docs/TECH18392
  • Restores of large Microsoft SQL server databases using the NetBackup for Microsoft SQL Server database extension fail before jobs start reading data from tape:  http://symantec.com/docs/TECH14997
  • How to troubleshoot Microsoft SQL Server database restore issues:   http://symantec.com/docs/TECH39006
  • Is it possible for SQL databases backed up with more than one stripe to be restored using fewer stripes when using the NetBackup for Microsoft SQL Server database agent?  http://symantec.com/docs/TECH48409
  • Changes to the NetBackup for SQL Microsoft SQL Server database agent allow a multi-striped image to be restored with a single stripe:  http://symantec.com/docs/TECH49125
9. Restore failed with error 25 or 13 because the BAR GUI was used to launch the restore instead of SQL Agent GUI
Refer to the following TechNotes for details and resolutions:

Legacy ID

331936

Article URL http://www.symantec.com/docs/TECH74475

Considerations when replacing libobk/orasbt when updating NetBackup for Oracle

Problem

Special coordination may be required to ensure that the NetBackup Client and NetBackup for Oracle libraries are properly updated when upgrading or applying a hotfix.

Solution

Overview:

Oracle RMAN uses the System Backup to Tape (SBT) API to perform backups to tape devices.  The NetBackup Oracle extension is an implementation of the SBT API.

Upgrading the SBT API can present some challenges for an application that runs 24 x 7.  The information below should be reviewed and well understood before planning the installation or upgrade of NetBackup on an Oracle host.

The nature of running processes is that, by default, external references are resolved and the relevant shared object libraries read from disk and mapped into the running process space only once during the life of a process.  Thus a process that runs continuously and performs a backup every day typically does not reload libraries before each backup.  Consequently, the only way to force the process to load an updated copy of a library is by stopping and restarting the process.  Hence the challenge to a 24 x 7 application.

Recommendations:

Follow these steps to perform a successful upgrade of NetBackup on an Oracle client host.  This applies to upgrading the NetBackup Oracle extension and the NetBackup Client whose libraries are used by the extension.  Prior to NetBackup 7.0, these are separately installed components and both should always be upgraded at the same time and to the same maintenance pack or release update level.  Starting with NetBackup 7.0, the NetBackup Client install automatically includes the NetBackup for Oracle extension.

Please note that all references to ‘sbt operations’ encompass backup, restore, and catalog maintenance operations.

1) Stop all processes for the Oracle instances on the host.  Some may have the old libraries mapped into process space.  If there is more than one instance and all are using NetBackup, then all should be stopped.

2) Stop the Oracle listener process if sbt operations have been performed using TNS aliases since NetBackup was last installed or upgraded.  In that configuration, the Oracle listener spawns the process that will do the sbt operation and it too will likely have the old libraries mapped into process space.

3) On HP-UX, the files on disk are the backing store for the running process and may be locked, causing any attempt to overwrite the files to fail.  Check if the files are in use and terminate any processes that are using them prior to updating the libraries.

$ fuser /usr/openv/lib/libxbsa*
$ fuser /usr/openv/netbackup/bin/libobk*

4) On AIX, the old library may already be in the library cache.  New or existing processes will look in the library cache first and may not load the new libraries from disk when resolving external references.  If all the Oracle processes noted above have been halted, clear the cache.

$ /usr/sbin/slibclean

5) On Windows, locate all ‘*xbsa*.dll’ and ‘orasbt.dll’ files and delete them.  The install will reinstall the new copies in the appropriate places and the older ones will no longer be inadvertently found higher in the search PATH when resolving external references.

6) Perform the install or upgrade per the NetBackup software distribution instructions.

7) After the install, inspect the output from the following commands to confirm that the expected versions of the files are installed.

$ cd /usr/openv
$ cat netbackup/bin/version
$ cat share/*oebu*
$ ls -1 lib/libxbsa* netbackup/bin/libobk* \
 | while read -r fn ; do
   netbackup/bin/goodies/support/versioninfo -f "$fn"
 done

Note that the versioninfo program has been included in the NetBackup server distribution since NetBackup 6.0, but was not added to the client distribution until the 6.5.4 release update.  It can be copied from a server of the same platform type as the client.

On Windows, locate the files and check their properties.

8) Following the install, ensure that Oracle is properly using the newly installed libraries by following the steps in TECH72307 in the Related Articles section.

Final Notes:

Newer versions of Oracle (9i and above) should dynamically load and unload the SBT library as needed, but in rare instances reportedly do not.  The following recommendations have been found to be useful in the past.

A) Consistently use SBT_LIBRARY for all SBT operations.  This will cause an explicit dlopen system call to locate and read the library file when the channel is allocated.  Then when the channel is released, an explicit dlclose system call will unload the library from the process space so that it can be reloaded from disk when the channel is allocated for the next backup or restore.  For example:

ALLOCATE CHANNEL … TYPE SBT_TAPE PARMS='SBT_LIBRARY=/usr/openv/netbackup/bin/<appropriate_libobk>';

On AIX, be aware that the old library will still be referenced by the library cache.  But if all sbt operations specified SBT_LIBRARY and are complete, the use count will be 0 so slibclean will remove it from the cache.

B) Avoid using a TNS alias to connect to the target database when the database is local to the host that is running RMAN.  Using an alias causes the Oracle listener to create the Oracle server process.  The listener may be running as a different user than the instance to backup or restore, which may have a different $ORACLE_HOME, which will cause a different path to be searched for libobk, which may cause an unexpected libobk to be loaded and used.  See the Related Articles for details regarding Oracle 11g.

How to confirm that Oracle is loading the correct NBU Oracle extension library files for use

Problem

How to confirm that Oracle is loading the correct NBU Oracle extension library files for use?

Solution

When the Oracle RMAN program performs a backup, restore, or catalog maintenance operation using the SBT API, it will utilize a libobk* shared object library or orasbt.dll.  This library is provided by third-party backup software, including the NetBackup (NBU) Oracle extension.  The NBU Oracle extension is dependent upon the xbsa library provided with the NBU Client.

Below is a process for confirming if the correct library files are installed and being utilized.  These examples are for Unix, but the process is the same for Windows and the details are at the bottom of this document.

1) Shut down the Oracle instance.

2) If TNS aliases have been or will be used by RMAN to connect to the instance(s), then also shut down the listener.

3) Confirm all Oracle processes are down.

$ ps -ef | grep -i ora

4) On AIX, also clear the library cache.  This will only work if all Oracle processes that utilize the libobk are down and the library use counter has decremented to 0.

$ /usr/sbin/slibclean

Note: Confirming that the Oracle processes are down is significant!  Once the Oracle instance or listener starts and attempts an SBT API operation, it loads the then-current library files from disk into memory.  The running process should unload the library when not needed and then load a new copy when needed, but in rare instances may not.  If that happens, the Oracle instance may remain ignorant of updated copies on disk and continue to use the older copy already loaded into process space.  See TECH72419 in the Related Articles section for additional details.

5) Capture the last access times on the library files defined by Oracle, NBU Oracle, and NBU Client.

(
ls -lu $ORACLE_HOME/lib*/libobk*
ls -lu /usr/openv/netbackup/bin/libobk*
ls -lu /usr/openv/lib/libxbsa*
) > /tmp/nbu-lib-access.before.out

6) Restart the Oracle instance.

7) Perform a backup, restore or catalog maintenance operation using RMAN.

8) Capture the last access times on the library files again.

(
ls -lu $ORACLE_HOME/lib*/libobk*
ls -lu /usr/openv/netbackup/bin/libobk*
ls -lu /usr/openv/lib/libxbsa*
) > /tmp/nbu-lib-access.after.out

9) Compare the output files from steps 5 and 8.

The access time on one of the NBU libobk.* files and one of the NBU libxbsa files should have been updated.  The access time on the libobk.* in one of the lib, lib32, or lib64 subdirectories below $ORACLE_HOME may also have been updated.
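The comparison in step 9 can be done with a line-by-line filter instead of by eye. In this sketch, changed_entries is a made-up helper and the two listings are invented stand-ins for the files captured in steps 5 and 8; any line of the later listing that does not appear verbatim in the earlier one is printed.

```sh
#!/bin/sh
# Print the lines of the "after" listing that do not appear verbatim in the
# "before" listing, i.e. files whose `ls -lu` line (access time) changed.
# changed_entries is a hypothetical helper name.
changed_entries() {
    grep -Fxv -f "$1" "$2"
}

# Demo with made-up listings; real input would be the two files
# written in steps 5 and 8.
before=$(mktemp) && after=$(mktemp) || exit 1
printf '%s\n' \
    '-r-xr-xr-x 1 root bin 100 Jan 10 08:00 /usr/openv/netbackup/bin/libobk.so' \
    '-r-xr-xr-x 1 root bin 200 Jan 10 08:00 /usr/openv/lib/libxbsa.so' > "$before"
printf '%s\n' \
    '-r-xr-xr-x 1 root bin 100 Jan 12 23:15 /usr/openv/netbackup/bin/libobk.so' \
    '-r-xr-xr-x 1 root bin 200 Jan 10 08:00 /usr/openv/lib/libxbsa.so' > "$after"

changed_entries "$before" "$after"
rm -f "$before" "$after"
```

With the real files the call would be `changed_entries /tmp/nbu-lib-access.before.out /tmp/nbu-lib-access.after.out`.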

If the access times did not update on the expected library files then either the Oracle instance configuration or the RMAN PARMS statement in the backup/restore/maintenance script specifies an alternate location for SBT_LIBRARY or another libobk* file exists higher on the LD_LIBRARY_PATH (Solaris & Linux), SHLIB_PATH (HP-UX), or LIBPATH (AIX).  The DBA will be familiar with the Oracle library load search process and can make the adjustments so that it uses the correct file.

If the access times updated on unexpected files, then the DBA will also need to correct the Oracle library load search process.  This may involve specifying SBT_LIBRARY, deleting older libraries that are higher on the search path, or symbolically linking the instance to the appropriate NBU libobk.  To build the correct symbolic links, use this script.

$ /usr/openv/netbackup/bin/oracle_link

If the access times did not update on any of the libobk files in the Oracle or NBU directories, then it is likely that RMAN is connecting to an Oracle instance running on another host instead of on this host.  The DBA should check if the target instance is being accessed via a TNS alias and if the alias is resolving correctly.

Note: Starting with NetBackup 6.5.6 (or any hotfix to support Oracle 11g R2), the NBU libobk is statically linked with the xbsa library, so the access time on the xbsa library will not change during the operations above.

10) After the symbolic links have been corrected, repeat steps 1-9 to ensure the correct library file is in use.

11) If the correct files are being accessed, but the libraries still will not load, then check the NetBackup Database Compatibility matrix to ensure that the platform, architecture (32 or 64 bit), and version of Oracle that is in use is also supported by the version of NetBackup that is installed.

12) For the next few days or weeks, enable the dbclient debug log and monitor for these lines.  The build dates for the xbsa library and NetBackup for Oracle should be the same or only a few days different.  If they differ significantly, investigate if a mismatched older library is present and in use by some process.
$ cd /usr/openv/netbackup/logs/dbclient
$ egrep 'NetBackup XBSA Interface|NetBackup for Oracle' log.??????
08:30:33.508 [11019] <4> VxBSAInit: Veritas NetBackup XBSA Interface - 7.1  2011020313
Veritas NetBackup for Oracle - Release 7.1 (2011020313)
08:34:13.996 [12828] <4> VxBSAInit: Veritas NetBackup XBSA Interface - 7.1  2011020313
Veritas NetBackup for Oracle - Release 7.1 (2011020313)
09:04:48.405 [26337] <4> VxBSAInit: Veritas NetBackup XBSA Interface - 7.1  2011020313
Veritas NetBackup for Oracle - Release 7.1 (2011020313)

For comparison, these are the build dates for libxbsa and libobk in the NetBackup 6.5 and 7.x releases.

Release    XBSA Interface build    NetBackup for Oracle build
6.5.1      6.5  2007111605         Release 6.5 (2007111606)
6.5.2      6.5  2008052300         Release 6.5 (2008052301)
6.5.3      6.5  2009120409         (No NB Oracle fixes in 6.5.3, should be using NB_ORA_6.5.2.)
6.5.4      6.5  2009050105         Release 6.5 (2009050106)
6.5.5      6.5  2009110613         (No NB Oracle fixes in 6.5.5, should be using NB_ORA_6.5.4.)
6.5.6      6.5  2010042404         Release 6.5 (2010042405)
7.0        7.0  2010010418         Release 7.0 (2010010418)
7.0.1      7.0  2010070723         Release 7.0 (2010070723)
7.1        7.1  2011020313         Release 7.1 (2011020313)
7.1.0.1    7.1  2011061213         Release 7.1 (2011061213)
7.1.0.2    7.1  2011082510         Release 7.1 (2011082510)
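To check a dbclient log against these build dates, the stamps can be extracted and de-duplicated. This is only a sketch: build_stamps is a made-up helper, and it assumes the build stamp is the last field of each matching log line, as in the sample log output in step 12.

```sh
#!/bin/sh
# Print the distinct build stamps found in dbclient log lines (stdin),
# one per line, prefixed with "xbsa" or "oracle".
# build_stamps is a hypothetical helper; it assumes the stamp is the last
# whitespace-separated field of each matching line.
build_stamps() {
    awk '
        /NetBackup XBSA Interface/ { print "xbsa " $NF }
        /NetBackup for Oracle/     { gsub(/[()]/, "", $NF); print "oracle " $NF }
    ' | sort -u
}

# Two lines from the sample log in step 12.
sample_dbclient() {
cat <<'EOF'
08:30:33.508 [11019] <4> VxBSAInit: Veritas NetBackup XBSA Interface - 7.1  2011020313
Veritas NetBackup for Oracle - Release 7.1 (2011020313)
EOF
}

sample_dbclient | build_stamps
```

More than one line per prefix in the output would mean that different library builds were loaded over the period the log covers.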

If the build dates are still not correct, then enable the RMAN Channel Trace and inspect the resulting trace file in the user dump destination to see where the library is being loaded from.  E.g.

$ cat udump/oracle9_ora_21811.trc
…snip…
try loading : libobk.so
Loaded (/app/oracle/lib/libobk.so)

Windows Specific Information:

  • The library files of interest are *xbsa*.dll and orasbt.dll.
  • Symbolic links do not exist so there may be multiple copies of either file on the host.  Find and remove all copies of the library files and then reinstall NetBackup so there is only one copy.  The reinstall will place the libraries in the correct location so there isn’t any need for an oracle_link script on Windows.
  • The library file access times can be viewed in the File/Computer Explorer, but the ‘Date Accessed’ column must be added to the display.

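NetBackup NAS NDMP backup process

  1. nbpem schedules the backup job.
  2. EMM selects a free tape drive, allocates a tape, and sends the tape load request to ltid.
  3. The ltid service on the server sends the corresponding SCSI commands to the NDMP host. If the robot is controlled by a media server, the commands are sent to the server that controls the robot.
  4. Once the tape is loaded, NetBackup sends the backup commands. Data can be written to a local tape drive, to a tape drive on another NAS device, or to a media server storage unit.
  5. The NAS device continuously sends backup information to the master server over the NDMP protocol, where it is recorded in the catalog.
  6. When the backup completes, the NAS device reports the backup status.
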

NetBackup Media Server Deduplication (MSDP) configuration via Storage Server Configuration Wizard fails RDSM has encountered an STS error

Problem

NetBackup Media Server Deduplication (MSDP) configuration via ‘Storage Server Configuration Wizard’ fails ‘RDSM has encountered an STS error: failed to update the storage server configuration due to unsupported platform, invalid configuration or system error’, and spad won’t start.

Error

Environment

- NetBackup 7.x Media Server Deduplication (MSDP) media server
- The operating system (OS) and NetBackup application were lost due to root disk corruption and/or server hardware failure.
- Deduplication storage and database reside on disk storage that is unaffected by the loss of the operating system and NetBackup application.

Cause

A po_list_2 file is present in the <dedup_db_path>\databases\catalog directory (<dedup_db_path>/databases/catalog on UNIX/Linux), which prevents spad from starting.

Solution

On the affected MSDP media server, confirm that the issue is caused by a resident po_list_2 file:

1) Open a command prompt to the pdde directory (typically <install_path>\Veritas\pdde\ on Windows or /usr/openv/pdde on UNIX/Linux) and run:

spad --trace -v
Warning: 22: Version mismatch: spad is version 6.6.0.45901 while libdct is version 6.6.0.35883
Info: set thread[ [0000000002969130]: ] max log size to 0
Info: set entire process max log size to 0
Info: recover po list [F:\msdp_db\databases\catalog\po_list_2]
Trace: _crConfigLoadA: loading configuration from F:\msdp\etc\puredisk\spa.cfg
Trace: _crSSLConfigLoadA: loading configuration from F:\msdp\etc\puredisk\spa.cfg
Trace: F:\msdp\etc\puredisk\spa.cfg: SSL:CAFilename configuration entry not found, skipped
Trace: WSRequestExt: submitting &request=9&login=agent_1627000000&passwd=0800624850a662c56181f6b47f00ea56&action=getRout
ingTables
Trace: <NULL>: SSL:PrivateKeyFilename configuration entry not found, skipped
Trace: <NULL>: SSL:CertFilename configuration entry not found, skipped
Trace: <NULL>: SSL:CAFilename configuration entry not found, skipped
Trace: Set NetLookupHost timeout to 0
Trace: Connecting to SPA route 0 f 127.0.0.1:10102
Trace: routeAlloc: parsing route: 0 f 127.0.0.1:10102
Trace: Copied 127.0.0.1 to 127.0.0.1
Trace: _crRouteAdd: obtaining list of local IPv4 addresses
Trace: _crRouteCheckLocal: Checking if gateway 127.0.0.1 matches a local IP address
Trace: _crRouteCheckLocal: checking 127.0.0.1 -> 127.0.0.1
Trace: Loading route for local IP address 127.0.0.1 from line 0 of
Trace: CRControlConnect: connecting to 127.0.0.1
Trace: _crSessionCheck: called
Trace: _crSessionCheck: setting up session to 127.0.0.1
Trace: _crGwConnect: setting up connection to 127.0.0.1:10102
Trace: TCP: TCP_NODELAY set
Trace: TCP: SO_KEEPALIVE set
Trace: TCP: using a 8192 byte receive buffer.
Trace: TCP: using a 8192 byte send buffer.
Trace: _crGwConnect: connect to: 127.0.0.1:10102
Trace: CRMapError: called by _crGwConnect (e:\pbe\darrieus-pdde-7.0.1-cft\x86_64\darrieus-pdde-7.0.1-cft\src\svn\libs\cl
ibs\libcr\route.c:172), errno = 10061
Error: 25053: Could not establish a connection to 127.0.0.1:10102: connect failed (No connection could be made because t
he target machine actively refused it. )
Error: 25053: Connection failed connection actively refused
Trace: CRShutdown: start
Trace: _crRouteDelete: deleting route to 127.0.0.1
Trace: _crGwClose: closing gateway 127.0.0.1
Trace: _crGwClose: gateway 127.0.0.1 closed: no error
Trace: _crRouteDelete: route to 127.0.0.1 deleted
Trace: CRShutdown: done
Error: 25053: Could not retrieve routing tables: webservice failed (connection actively refused)
Error: 25053: _getRoutingTable failed
Trace: CRShutdown: start
Trace: CRShutdown: done
Error: 25044: can’t get deref cr ctx. [2]
Error: 25044: Could not delete do with poid
Error: 25044: can’t delete cdo from CR [d8be9f63f8e55184405a2627f5316a74]
Error: 25044: could not delete file po F:\msdp_db\databases\catalog\2\netbackup-host1\FilesvrBackup\netbackup-host1
300265421_C1_F1.info
Error: 25044: can’t update entry [2|/svr-netbackup-1/Online-Catalog-Backup|svr-netbackup-1_1300265421_C1_F1.info|0|||||3
|D|0|0100600|0||260|1300265444|1300265444|1300265444|246||||0||PDVFS_2_F_0_ID_6_RT_0|0
]
Error: 25044: can’t recover [F:\msdp_db\databases\catalog\po_list_2]. (not initialized)
Error: 25044: can’t recover [po_list]. Please check the log.
Error: 26016: Catalog: Recover failure.

 

2) Move the po_list_2 file to an alternate directory so that spad no longer attempts to recover it at startup.
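On UNIX/Linux the move can be scripted so that the file is kept under a timestamped name rather than overwritten or lost. This is a sketch with assumptions: set_aside_po_list is a made-up helper, the directories are passed in by the caller, and the deduplication services are assumed to be stopped first.

```sh
#!/bin/sh
# Move po_list_2 out of the MSDP catalog directory into a holding directory.
# $1 = catalog directory (for example <dedup_db_path>/databases/catalog)
# $2 = destination directory outside the catalog
# set_aside_po_list is a hypothetical helper; stop spad/NetBackup before use.
set_aside_po_list() {
    catalog_dir=$1
    dest_dir=$2
    src="$catalog_dir/po_list_2"
    if [ ! -f "$src" ]; then
        echo "no po_list_2 in $catalog_dir" >&2
        return 1
    fi
    mkdir -p "$dest_dir" &&
    mv "$src" "$dest_dir/po_list_2.$(date +%Y%m%d%H%M%S)"
}
```

The function returns non-zero when there is nothing to move, so a wrapper script can tell an already-clean catalog from a successful move.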

NetBackup STS error 2060012: call should be repeated

Problem

Job details

10/10/2012 7:57:55 PM – Critical bptm(pid=1920) sts_close_handle failed: 2060012 call should be repeated
10/10/2012 7:57:55 PM – Critical bptm(pid=1920) image close failed: error 2060012: call should be repeated
10/10/2012 7:57:55 PM – Info bptm(pid=1920) EXITING with status 84 <----------

bptm logs:
16:22:57.462 [32054] <2> 6638534:bptm:32054:xxx10: [DEBUG] PDVFS: pdvfs_lib_log: Received binary message from xxx10.host.name.com:10102: REPLY 2425346386 11 1: 4
16:22:57.462 [32054] <2> 6638534:bptm:32054:xxx10: [DEBUG] PDVFS: pdvfs_lib_log: _crDataGetSimple: 128 bytes
16:22:57.462 [32054] <2> 6638534:bptm:32054:xxx10: [DEBUG] PDVFS: pdvfs_lib_log: readn: socket recv want 128 bytes, recv 128 bytes
16:22:57.462 [32054] <16> 6638534:bptm:32054:xxx10: [ERROR] PDVFS: pdvfs_lib_log: WSRequestPOST failed: unknown error (4)
16:22:57.462 [32054] <16> 6638534:bptm:32054:xxx10: [ERROR] PDVFS: _pdvfs_cas_send_po_list: MBPOAddList failed: unknown error (4)
16:22:57.462 [32054] <16> 6638534:bptm:32054:xxx10: [ERROR] PDVFS: pdvfs_cas_sync_po_list: Error _pdvfs_cas_send_po_list failed: unknown error (4)
16:22:57.462 [32054] <16> 6638534:bptm:32054:xxx10: [ERROR] PDVFS: pdvfs_set_mb_import: pdvfs_cas_sync_po_list failed: Input/output error (5)
16:22:57.462 [32054] <4> 6638534:bptm:32054:xxx10: [INFO] PDVFS: PdvfsRead: return fd=4 res=6
16:22:57.462 [32054] <16> 6638534:bptm:32054:xxx10: [ERROR] PDSTS: sync_pdvfs: sync using </ebs10#1/.sync> failed, expected <COMPLETED>, found <FAILED> (2060012:call should be repeated)

We can see by the log entry:
PDVFS: pdvfs_lib_log: Received binary message from xxx10.host.name.com:10102: REPLY 2425346386 11 1: 4
PDVFS: pdvfs_lib_log: readn: socket recv want 128 bytes, recv 128 bytes

A connection to spad (port 10102) was established and a reply was received.

The next step is to examine spad.log.

Error

EXIT STATUS 84

and

error 2060012: call should be repeated

Environment

2 x NetBackup 7.5.0.3 Win2k8 R2 SP1 media servers, both configured as MSDP.

Cause

spad.log at startup shows the following:

October 11 18:38:36 INFO: 25002: cannot open file: D:\MSDP\databases\catalog\2\Ho-xxxx-customer-hostname-removed_1348848026_C1_HDR.info[R_1] (no such object)
: :
October 11 18:38:36 ERR: 25002: can’t recover [D:\MSDP\databases\catalog\po_list_2]. (no such object)
October 11 18:38:36 ERR: 25002: can’t recover [po_list]. Please check the log.
October 11 18:38:36 WARNING: 25000: Recovery of catalog failed during startup, will recover again in run time!
October 11 18:38:36 INFO: CR mode in spa db is normal !
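The same signature can be checked for on other MSDP servers before status 84 failures appear. The sketch below searches a spad.log for a failed po_list recovery; po_list_recovery_failed is a made-up helper, and the pattern is based only on the log lines quoted above.

```sh
#!/bin/sh
# Succeed (exit 0) if the given spad.log contains a failed po_list recovery,
# i.e. a line containing "recover [....po_list...".
# po_list_recovery_failed is a hypothetical helper name.
po_list_recovery_failed() {
    grep -q 'recover \[.*po_list' "$1"
}

# Demo on a temporary file holding two lines in the style of the log above.
log=$(mktemp) || exit 1
cat > "$log" <<'EOF'
October 11 18:38:36 ERR: 25002: can't recover [D:\MSDP\databases\catalog\po_list_2]. (no such object)
October 11 18:38:36 INFO: CR mode in spa db is normal !
EOF

if po_list_recovery_failed "$log"; then
    echo "po_list recovery failure found"
fi
rm -f "$log"
```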

Solution

Shut down NetBackup, move D:\MSDP\databases\catalog\po_list_2 to another location, and restart NetBackup.