Server diagnostics checks the server status and prepares it for a new user.
During the diagnostics:
- equipment characteristics are checked;
- drives are analyzed for S.M.A.R.T. errors via the smartctl utility;
- the local connection speed is checked to eliminate network card errors;
- if necessary, BMC is configured and the server’s hard disks are cleaned.
The results are saved into the DCImanager 6 database.
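As a rough illustration of the S.M.A.R.T. check, the sketch below parses the JSON report that recent smartmontools versions can print with `smartctl --json -H <device>`. It is not the script shipped with the diagnostics template; the function name `smart_health_passed` and the sample report are assumptions made for this example.

```python
import json


def smart_health_passed(report_text: str) -> bool:
    """Return True if a smartctl JSON report says the overall health check passed."""
    report = json.loads(report_text)
    # smartctl's JSON output stores the overall verdict under smart_status.passed;
    # a missing key is treated as a failure to stay on the safe side.
    return bool(report.get("smart_status", {}).get("passed", False))


# Trimmed-down sample report (illustrative, not captured from a real drive).
sample = '{"device": {"name": "/dev/sda"}, "smart_status": {"passed": true}}'
print(smart_health_passed(sample))
```

In a real check, the report text would come from running smartctl on each drive, for example through `subprocess.run`.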
Some of the software and third-party services used for diagnostics are installed on the server with DCImanager 6; others are installed on the location. For details, see the article Locations. General information.
The process includes the following steps:
- Preparing a diagnostics template.
- Uploading the diagnostics template.
- Running diagnostics.
- Completing diagnostics.
The system uses the Diag6 template, which is based on SystemRescueCD 6.
The software is uploaded to the server through TFTP and HTTP. The TFTP and HTTP servers run on the location.
During diagnostics, the server receives network settings from the location DHCP server and loads the SystemRescueCD operating system. When the process is completed, the software is deleted and the server network settings are reset.
If the platform is integrated with a billing system, the server is assigned an IP address from the pool for server deallocation for the duration of the diagnostics.
Network settings from the DHCP server are transmitted at the data link layer (L2); further interaction via TFTP and HTTP takes place at the network layer (L3).
The maximum time allotted for diagnostics is 60 minutes. If the diagnostics is not completed within this time, DCImanager 6 will forcibly complete the diagnostic operation.
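The 60-minute limit can be pictured as a polling loop with a deadline. The sketch below is only an illustration of that idea, not the platform's implementation; `wait_for_diagnostics`, its parameters, and the returned status strings are hypothetical names chosen for this example.

```python
import time

DIAGNOSTICS_TIMEOUT_SECONDS = 60 * 60  # the 60-minute limit stated above


def wait_for_diagnostics(is_finished, timeout=DIAGNOSTICS_TIMEOUT_SECONDS,
                         poll_interval=10.0, clock=time.monotonic,
                         sleep=time.sleep):
    """Poll is_finished() until it returns True or the timeout expires.

    Returns "completed" if the diagnostics finished in time, or "forced"
    if the deadline passed and the operation has to be ended forcibly.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        if is_finished():
            return "completed"
        sleep(poll_interval)
    return "forced"
```

The `clock` and `sleep` parameters exist only so the loop can be exercised with a fake clock instead of waiting in real time.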
Preparing the diagnostics template
DCImanager 6 performs the following steps:
- Generates the values for parameters and macros of the diagnostics template.
- Sets the configuration file of the DHCP-server.
- Prepares the files to pass through TFTP and HTTP.
- Generates a new BMC connection password if the function Configure BMC is enabled.
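The password generation step could look roughly like the sketch below. How DCImanager 6 actually builds the password is not described here, so the length, the alphabet, and the function name `generate_bmc_password` are assumptions; real BMCs may impose their own length and character restrictions.

```python
import secrets
import string


def generate_bmc_password(length: int = 16) -> str:
    """Generate a random password from letters and digits using a CSPRNG."""
    alphabet = string.ascii_letters + string.digits
    # secrets.choice draws from the operating system's secure random source.
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(len(generate_bmc_password()))
```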
Uploading the diagnostics template
- DCImanager 6 reboots the server.
- The DHCP-server passes the network settings and the paths to the files to be transferred through TFTP to the server under diagnostics.
- The server downloads the configuration file of the iPXE-loader ipxe.conf through TFTP.
- The DHCP-server passes the network settings for the iPXE-loader to the server under diagnostics.
- The iPXE-loader downloads the files for running the diagnostics process and the SystemRescueCD operating system images through HTTP, then loads SystemRescueCD into RAM.
- The DHCP-server passes the network settings to the server under diagnostics for SystemRescueCD.
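To make the HTTP stage of the chain above more concrete, the sketch below renders a minimal iPXE script that requests network settings and then fetches a kernel and an initrd over HTTP. It does not reproduce the actual contents of ipxe.conf; the location address, directory, and file names are hypothetical, and only the standard iPXE commands `dhcp`, `kernel`, `initrd`, and `boot` are used.

```python
def render_ipxe_script(http_base: str, kernel: str, initrd: str,
                       kernel_args: str = "") -> str:
    """Build a minimal iPXE script that fetches a kernel and initrd over HTTP."""
    lines = [
        "#!ipxe",
        "dhcp",  # ask the DHCP server for network settings
        f"kernel {http_base}/{kernel} {kernel_args}".rstrip(),
        f"initrd {http_base}/{initrd}",
        "boot",  # start the downloaded kernel
    ]
    return "\n".join(lines) + "\n"


# Hypothetical location address and file names, for illustration only.
print(render_ipxe_script("http://192.0.2.10/diag", "vmlinuz", "initrd.img"))
```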
Running diagnostics
After the template is uploaded, the system will run the autorun diagnostics script. The autorun script:
- Collects information about server equipment and its performance.
- Sends the data to the location.
- Clears the hard drives under one of the conditions:
  - if the option Clear SSD and HDD disks during diagnostics is enabled;
  - if the macro $CLEAR_HDD or $FULL_HDD_CLEAR with the value "YES" is added to the diagnostic template. See Template macros for more details.
- Configures BMC if the function Configure BMC is enabled:
  - Allocates an IP address for BMC. If the connection to BMC is not configured, the IP address will be allocated from the pool selected in the diagnostics settings.
  - Removes all created BMC users.
  - Creates an administrator account with the specified name.
  - Creates an operator account if a name is specified for it. The server owner will only be able to connect to the BMC under this user.
  If the diagnostic fails, existing users will not be removed and new users will not be created.
- Reboots or shuts down the server depending on the option selected when starting the diagnostics.
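The two conditions for clearing the drives combine as a logical OR. The sketch below expresses that rule; it is not taken from the autorun script, and the function name `should_clear_disks` and the dictionary representation of template macros are assumptions made for the example.

```python
def should_clear_disks(option_enabled: bool, macros: dict) -> bool:
    """Return True if either condition for clearing the drives is met."""
    # Condition 2: one of the two macros is present with the value "YES".
    macro_requested = any(
        macros.get(name) == "YES" for name in ("$CLEAR_HDD", "$FULL_HDD_CLEAR")
    )
    # Condition 1 is the option chosen when starting diagnostics.
    return option_enabled or macro_requested


print(should_clear_disks(False, {"$FULL_HDD_CLEAR": "YES"}))
```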
Completing diagnostics
DCImanager 6 performs the following steps:
- Retrieves the information collected from the location during diagnostics.
- Resets the modified configuration files.
- Deletes the directories and files that were created for diagnostics.
- Saves the server configuration into the database.
- Depending on the platform settings, restarts or shuts down the server.