Monitoring and Healthcheck

EJBCA contains a health check service that can be used for health monitoring remotely. This should be considered essential if running EJBCA in a clustered installation, as it can be used to determine whether a node is able to remain in the cluster or needs to be taken out.

The purpose of a Healthcheck is to notify when something is not as expected, so that a monitoring system can send alarms and cluster nodes can be taken offline.
An example, and a common source of misunderstanding, is that taking a CA offline (CA Service State) is expected and will not result in Healthcheck warnings. The motivation is that a CA that has been deactivated was deactivated deliberately, i.e. everything is as it should be, and the Healthcheck should not warn. This is in contrast to the case where the CA is activated but its Crypto Token goes offline: the CA is expected to be online but cannot be, because the Crypto Token is offline, and therefore the Healthcheck should warn.

Note that a configuration as in the following example will thus not result in Healthcheck warnings.

[Screenshot: images/download/attachments/134451396/ca-off-line.png]


Servlet URL

The servlet is located at the URL: http://localhost:8080/ejbca/publicweb/healthcheck/ejbcahealth

Note that the client (e.g. the load balancer) is responsible for closing the connection to the application server. Failure to do so may result in denial of service, preventing other clients from connecting to EJBCA.
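
A monitoring client could poll the servlet along the following lines. This is a minimal sketch and not part of EJBCA: the function names interpret_health and check_health and the 5-second timeout are illustrative assumptions, and ALLOK is only the default OK message, which may be configured differently. The response is read inside a with block (or closed explicitly for error responses) so that the client releases the connection, as required by the note above.

```python
import urllib.error
import urllib.request

# URL from the section above; adjust host and port for your installation.
HEALTH_URL = "http://localhost:8080/ejbca/publicweb/healthcheck/ejbcahealth"


def interpret_health(status, body, ok_message="ALLOK"):
    """Return (healthy, detail) for an HTTP status code and response body.

    Hypothetical helper: the node counts as healthy only when the status is
    200 and the body starts with the expected OK message.
    """
    text = body.strip()
    healthy = status == 200 and text.startswith(ok_message)
    return healthy, text


def check_health(url=HEALTH_URL, timeout=5.0):
    """Fetch the health check URL once and always close the connection."""
    try:
        # The context manager closes the connection when the block exits.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return interpret_health(response.status, body)
    except urllib.error.HTTPError as err:
        # Error responses (for example HTTP 500) are raised as HTTPError.
        try:
            body = err.read().decode("utf-8", errors="replace")
        finally:
            err.close()
        return interpret_health(err.code, body)
    except OSError as err:
        # Node unreachable or timed out: treat it as unhealthy.
        return False, str(err)
```

Checking both the status code and the body means the sketch still reports a failure if the servlet has been configured not to send HTTP 500 on errors.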

Configuration

The CAs that the health check service checks can be configured in the Admin Web, on the CA Activation page as well as on the Edit CA page.

Common Criteria Compliance

To be fully Common Criteria compliant, the CA's HSM token configuration should use a different key for signature tests than for certificate signing (the "testKey" alias should point to a key with no other uses).

The behavior of the servlet can be modified by configuring the values below in conf/ejbca.properties.

General Configuration

The following configuration parameters may be set to configure authorization and what the service checks:

| Key | Default | Description |
| --- | --- | --- |
| healthcheck.amountfreemem | 1 | The amount of memory that must be free on the server, in megabytes. |
| healthcheck.dbquery | select 1 | The query string used to perform a minimal check that the database is working. |
| healthcheck.authorizedips | 127.0.0.1 | The remote IPs that may call this healthcheck servlet. Multiple IPs may be separated by semicolons. |
| healthcheck.catokensigntest | false | Set to true to perform a test signature on each CA token during the check. Otherwise the check only verifies that the token status is active. |
| healthcheck.publisherconnections | false | Set to true to perform a health test on all active publisher connections. |

Maintenance File Properties

| Key | Default | Description |
| --- | --- | --- |
| healthcheck.maintenancefile | | Location of the file containing information about maintenance. |
| healthcheck.maintenancepropertyname | DOWN_FOR_MAINTENANCE | The key of the property in the maintenance file. The entry should be in the following format: DOWN_FOR_MAINTENANCE=true. |
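
As an illustration of the maintenance file format, the following sketch writes and reads back a one-line file. It is not part of EJBCA: the helper names set_maintenance and is_down_for_maintenance are hypothetical, the file is assumed to contain nothing but this one property, and treating any value other than true as "not in maintenance" is an assumption rather than documented behavior.

```python
from pathlib import Path


def set_maintenance(path, down, property_name="DOWN_FOR_MAINTENANCE"):
    """Overwrite the file with a single line such as DOWN_FOR_MAINTENANCE=true."""
    # Assumption: writing "false" is how the node is put back in service.
    value = "true" if down else "false"
    Path(path).write_text(f"{property_name}={value}\n", encoding="utf-8")


def is_down_for_maintenance(path, property_name="DOWN_FOR_MAINTENANCE"):
    """Read the file and report whether the property is set to true."""
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without a key=value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == property_name:
            return value.strip().lower() == "true"
    return False
```

The path passed to these helpers would be the one configured in healthcheck.maintenancefile.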

Servlet Configuration

The following parameters configure what message or HTTP error code the health service returns.

| Key | Default | Description |
| --- | --- | --- |
| healthcheck.okmessage | ALLOK | Text string used to say that everything is OK with this node. Any defined property value can be used here by inserting it as a property, e.g.: ALLOK ${httpsserver.hostname} Version ${app.version.number} |
| healthcheck.sendservererror | true | Set to true if the HTTP error code 500 should be sent in case of error. |
| healthcheck.customerrormessage | null | Allows for a custom error message to be configured. |

Error Messages

If an error is detected, one or more of the following error messages are reported. All errors are sent with a response code of 500.

| Error | Description |
| --- | --- |
| MEM: Error Virtual Memory is about to run out, currently free memory : number | The JVM is about to run out of memory. |
| DB: Error creating connection to database | The JDBC connection to the database failed. This might occur if the database crashes or the network is down. |
| CA: Error CA Token is disconnected: CAName | This is a sign of hardware problems with one or several of the hard CA tokens in the node. |
| MAINT: DOWN_FOR_MAINTENANCE | Reported when healthcheck.maintenancefile is used and the node is set to be offline. |
| Error when testing the connection with publisher: PublisherName | Reported when a test connection to one of the publishers failed. |
| Could not perform a test signature on the audit log. | Reported when the audit log failed to sign (if database protection is enabled). |
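
A monitoring script might map these messages to alarm categories by their leading text. The sketch below is illustrative only: the category labels and the function name classify_error are assumptions, and a message configured through healthcheck.customerrormessage would fall into the "unknown" category.

```python
# Prefixes taken from the error messages listed above; labels are illustrative.
ERROR_PREFIXES = {
    "MEM:": "memory",
    "DB:": "database",
    "CA:": "ca_token",
    "MAINT:": "maintenance",
    "Error when testing the connection with publisher": "publisher",
    "Could not perform a test signature on the audit log": "audit_log",
}


def classify_error(message):
    """Return an illustrative category label for one health check error message."""
    text = message.strip()
    for prefix, label in ERROR_PREFIXES.items():
        if text.startswith(prefix):
            return label
    # Custom or unrecognized messages end up here.
    return "unknown"
```

For example, classify_error("DB: Error creating connection to database") returns "database", which a monitoring system could route to the database on-call rather than to the PKI operators.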
