A forced reboot of the DCImanager 6 platform can corrupt the ClickHouse database, after which the clickhouse_server container stops working.
As a result, server statistics are not displayed and the interface shows the error: Error 10106, Graphite request error, Got response code 503.
Diagnostics
- Connect to the server with the platform via SSH.
- Connect to the clickhouse_server container:
docker exec -it clickhouse_server sh
Container names may differ depending on the Docker Compose version in use: a hyphen may replace the underscore character in container names. To get the exact names of the containers, run the command:
docker ps -a
- Check clickhouse-server.log for the "ClickHouse init process failed" error:
grep 'ClickHouse init process failed' /var/log/clickhouse-server/clickhouse-server.log
If the command returns matching lines, the ClickHouse database is corrupted. Example of output:
ClickHouse init process failed.
- Check clickhouse-server.err.log in real time:
tail -F /var/log/clickhouse-server/clickhouse-server.err.log
If the ClickHouse database is corrupted, the log shows errors like these:
2023.02.07 08:42:05.203655 [ 115 ] {} <Error> dci.graphite (ec566f01-a447-406f-b275-92a2b3cd85ab): Detaching broken part /var/lib/clickhouse/store/ec5/ec566f01-a447-406f-b275-92a2b3cd85ab/202212_43911_57173_11575 (size: 0.00 B). If it happened after update, it is likely because of backward incompatibility. You need to resolve this manually
2023.02.07 08:42:05.211909 [ 115 ] {} <Error> dci.graphite (ec566f01-a447-406f-b275-92a2b3cd85ab): while loading part 202212_43911_57178_11580 on path store/ec5/ec566f01-a447-406f-b275-92a2b3cd85ab/202212_43911_57178_11580: Code: 27. DB::ParsingException: Cannot parse input: expected 'columns format version: 1\n' at end of stream. (CANNOT_PARSE_INPUT_ASSERTION_FAILED), Stack trace (when copying this message, always include the lines below)
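The diagnostic checks above can be sketched as two small shell helpers. This is only an illustration: the function names find_clickhouse_container and has_corruption_signs are hypothetical, and the name pattern assumes the container is called either clickhouse_server or clickhouse-server, as described in the note on container names.

```shell
#!/bin/sh
# Hypothetical helper: read container names from stdin and print the first
# one that matches clickhouse_server or clickhouse-server.
find_clickhouse_container() {
    grep -E 'clickhouse[-_]server' | head -n 1
}

# Hypothetical helper: succeed (exit code 0) if the given log file contains
# one of the corruption signatures quoted in this article.
has_corruption_signs() {
    grep -q -E 'ClickHouse init process failed|Detaching broken part' "$1"
}
```

On the platform server, the helpers might be combined as follows: `docker ps -a --format '{{.Names}}' | find_clickhouse_container` prints the container name, and `has_corruption_signs` can then be run against a copy of clickhouse-server.log or clickhouse-server.err.log taken from that container.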
Solution
- Run the database recovery:
docker exec -it clickhouse_server touch /var/lib/clickhouse/flags/force_restore_data
- Check clickhouse-server.err.log for errors like:
2023.02.08 03:19:33.367657 [ 546 ] {7d518126-889d-4c64-83bf-f87356dc802a} <Error> DynamicQueryHandler: Cannot send exception to client: Code: 24. DB::Exception: Cannot write to ostream at offset 280. (CANNOT_WRITE_TO_OSTREAM)
To check for this error, run the command:
grep 'Cannot write to ostream' /var/log/clickhouse-server/clickhouse-server.err.log
- If the command returns such errors, restart the container:
docker stop clickhouse_server; docker start clickhouse_server
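The recovery steps above could be combined into one shell function, sketched below under a few assumptions. The function name recover_clickhouse and the CONTAINER and ERR_LOG variables are hypothetical. The container name defaults to clickhouse_server and should be set to the exact name reported by docker ps -a. The sketch also runs the log check through docker exec rather than from a shell inside the container.

```shell
#!/bin/sh
# Assumption: default container name; override it if docker ps -a shows
# clickhouse-server or another variant.
CONTAINER="${CONTAINER:-clickhouse_server}"
ERR_LOG=/var/log/clickhouse-server/clickhouse-server.err.log

recover_clickhouse() {
    # Step 1: create the flag file that requests the database recovery.
    docker exec "$CONTAINER" touch /var/lib/clickhouse/flags/force_restore_data || return 1

    # Step 2: look for the "Cannot write to ostream" error in the error log.
    if docker exec "$CONTAINER" grep -q 'Cannot write to ostream' "$ERR_LOG"; then
        # Step 3: the error is present, so restart the container.
        docker stop "$CONTAINER" && docker start "$CONTAINER"
    fi
}
```

Calling `recover_clickhouse` on the platform server performs the three steps in order. The restart is skipped when the error is not found in the log.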