Troubleshooting

This document provides troubleshooting steps for resolving various issues encountered in ARS across different scenarios, such as:

  • Configuration issues

  • Problems encountered in production

  • Restarting ARS Services

Follow the steps listed below to address these situations.

Note

Back-end access is required to check whether the services are running, to check the port status, and to modify the configuration file if needed.

Troubleshooting Steps

Expansion of Disk

If disk space allocated to the ARS VM, or a virtual disk attached to it, is not reflected on /dev/sda, follow the steps below to expand the disk.

The same procedure can be used to extend the ARS VM HDD:

  1. Start the ARS VM.

  2. Run the fdisk /dev/sda command to open the disk in the interactive fdisk prompt.

  3. Press p to list the partition table and note the starting sector of the sda1 partition.

  4. Press d to delete the partition.

  5. Select partition number 1.

  6. Press n to create a new one.

  7. Partition type - (p) primary.

  8. Sector start - 2048, or whichever starting sector was noted in step 3.

  9. Sector end - default press enter.

  10. Press w to write to disk.

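  11. Run the partprobe /dev/sda and resize2fs /dev/sda1 commands one by one. partprobe makes the kernel re-read the updated partition table, and resize2fs grows the file system to fill the enlarged partition.

  12. Reboot the OS and verify that the updated disk size is visible using the fdisk -l command.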
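Because step 4 deletes the partition, entering a different starting sector in step 8 can make the existing data unreachable. The following is a minimal sketch for recording the starting sector before opening fdisk, assuming a Linux kernel that exposes partition geometry under /sys/block. The function name part_start_sector and the SYSFS_ROOT override are hypothetical helpers introduced for illustration and are not part of ARS.

```shell
# Print the starting sector of a partition as reported by the kernel.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
part_start_sector() {
  disk=$1   # for example: sda
  part=$2   # for example: sda1
  cat "${SYSFS_ROOT:-/sys}/block/$disk/$part/start"
}

# Example (on the ARS VM): part_start_sector sda sda1
```

The printed value should match the start sector shown by p in step 3; if the two differ, it is safer to stop and investigate before deleting the partition.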

Configuration Checks

  1. Checking ports and the status of the services.

    1. After the ARS VM has booted, run the netstat -plntu command on ARS to check the port status.

    2. The following ports should be in the LISTEN state:

      • 5601 - For OpenSearch Dashboards
      • 9200 - For OpenSearch
      • 5044 - For Logstash (HySecure)
      • 514 - For Logstash (HyWorks)
      • 9201 - For Logstash (Metricbeat)

    Note

    Port 5044 is the default port used by HySecure for sending Filebeat logs. However, this port can be customized, so if an organization is using a custom port, verify that it is properly configured and listening.
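The port check above can be scripted so that every expected port is verified in one pass. This is a sketch under stated assumptions: check_ports is a hypothetical helper name, it reads netstat -plntu output from standard input, and the default port list is the one given above (pass a different list if the Logstash port is customized). UDP sockets are treated as present when they are bound, because netstat shows no LISTEN state for UDP.

```shell
# Report which expected ARS ports appear as listening in
# `netstat -plntu` output read from stdin. Ports default to the list above.
check_ports() {
  ports=${*:-"5601 9200 5044 514 9201"}
  input=$(cat)
  for port in $ports; do
    if printf '%s\n' "$input" | awk -v p=":$port" '
        # Column 4 is the local address; TCP rows carry LISTEN in column 6,
        # while UDP rows have no state column.
        substr($4, length($4) - length(p) + 1) == p && ($6 == "LISTEN" || $1 ~ /^udp/) { found = 1 }
        END { exit !found }'; then
      echo "OK      $port"
    else
      echo "MISSING $port"
    fi
  done
}

# Example (on ARS): netstat -plntu | check_ports
```

Matching on the end of the local address, rather than a plain grep for the number, avoids counting port 1514 as port 514.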

Important command to check on the HySecure Gateway side for every HySecure node

Use the netstat -ant | grep <ARS_IP> command to check:

  • HySecure should have an established TCP connection to the ARS IP on the Logstash port (5044 by default, or the customized port).
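A plain grep for the ARS IP also matches longer addresses that merely contain it (for example, 10.0.0.9 matches 10.0.0.91) and connections in states such as TIME_WAIT. As a hedged alternative, the sketch below counts only ESTABLISHED connections whose remote endpoint is exactly the ARS IP and Logstash port. count_ars_connections is a hypothetical helper name, and the column positions assume the usual Linux netstat -ant layout.

```shell
# Count ESTABLISHED TCP connections to ARS in `netstat -ant` output on stdin.
# Arguments: the ARS IP and the Logstash port (5044 unless customized).
count_ars_connections() {
  awk -v peer="$1:${2:-5044}" '
    # Column 5 is the foreign address and column 6 is the connection state.
    $5 == peer && $6 == "ESTABLISHED" { n++ }
    END { print n + 0 }'
}

# Example (on a HySecure node): netstat -ant | count_ars_connections <ARS_IP> 5044
```

A result of 0 on a node suggests that the node is not currently delivering logs to ARS.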

Production Issues

Follow the steps below to check for any issues if logs have stopped arriving at ARS in the production environment:

  • Check whether the Logstash port (5044 by default) is in the LISTEN state. Since this port is customizable, check the port that is actually configured using the command: netstat -plntu | grep <logstash_customized_port>

  • Check the Logstash logs using the command: tail -f /var/log/logstash/logstash-plain.log | grep error

  • Restart the Logstash service if required:

  • Enter the command: ps -ax | grep logstash

    • Note the PID of logstash.

    • Kill the process using the command: kill -9 <PID>

    • Restart the service by running the script: /etc/logstash_start.sh

  • Check HDD and RAM utilization of ARS:

    • For HDD use the command: df -h

    • For RAM use the command: free -h

    Note

    • If more than 80% of the HDD storage is used, delete all the closed indices in OpenSearch, provided that a backup is taken before the closed indices are deleted.
    • If auto backup is enabled, check the ars_backup directory in the root directory; it contains all the backup logs. Move these logs to another storage location to free up space.
  • Check whether OpenSearch has gone into read-only mode.

  • Enter the command: tail -f /var/log/opensearch/opensearch.log | grep read-only

    • Disable the read-only mode. Enter the command (with the proper username and password): curl -X PUT "http://admin:password@localhost:9200/_all/_settings?pretty" -H 'Content-Type:application/json' -d'{"index.blocks.read_only_allow_delete": false}'

    • If the logs are still not appearing, or if the indices are still stuck in read-only mode, check the OpenSearch logs. There might be a message indicating that the flood stage disk watermark [95%] has been exceeded. In this case, the disk size must be increased. Use the following command to monitor the logs for warnings: tail -f /var/log/opensearch/opensearch.log | grep WARN

  • Connection timeout issue:

  • Check if there are any rules for iptables; if yes, flush them using the command: iptables -F
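For the HDD check in the list above, the two thresholds mentioned (more than 80% used, and the 95% flood stage watermark) can be evaluated with a small script instead of reading df -h by eye. This is a minimal sketch: classify_disk_usage and disk_usage_pct are hypothetical helper names, and treating 95% or more as flood stage is an approximation, since df rounds the percentage it reports.

```shell
# Classify a usage percentage against the thresholds mentioned above:
# more than 80% calls for cleanup; 95% or more is treated as flood stage.
classify_disk_usage() {
  pct=${1%\%}   # accept "83" or "83%"
  if [ "$pct" -ge 95 ]; then
    echo "flood-stage"
  elif [ "$pct" -gt 80 ]; then
    echo "cleanup"
  else
    echo "ok"
  fi
}

# Read the usage percentage for a mount point from `df -P` (POSIX format),
# where the fifth column of the second line is the capacity in use.
disk_usage_pct() {
  df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example: classify_disk_usage "$(disk_usage_pct /)"
```

A result of cleanup corresponds to the closed-index and ars_backup cleanup described in the note, while flood-stage corresponds to the case where the disk size must be increased.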

Restart ARS Services

  • Restart the Opensearch service using the command: systemctl restart opensearch

Note

Upon restart, OpenSearch may take several minutes to fully initialize, depending on the number of open indices stored on the disk.

How to make sure OpenSearch is ready

Run the following command and check the status column in the output. It should be yellow (green also indicates that the cluster is ready).

curl -X GET 'https://localhost:9200/_cat/health?v' -u admin:accopsars -k
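Since initialization can take several minutes, the health check can be repeated until the cluster reports a ready status. The sketch below assumes the header row printed by _cat/health?v contains a status column. The helper names health_status and wait_for_opensearch, the ARS_USER and ARS_PASS variables, and the 10-second polling interval are hypothetical choices made for illustration.

```shell
# Print the value of the "status" column from `_cat/health?v` output on stdin.
# The column is located by its header name rather than by a fixed position.
health_status() {
  awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "status") col = i }
       NR == 2 && col { print $col }'
}

# Poll OpenSearch until the status is yellow or green, or give up after
# the given number of attempts (default 30, ten seconds apart).
wait_for_opensearch() {
  tries=${1:-30}
  while [ "$tries" -gt 0 ]; do
    status=$(curl -s -k -u "$ARS_USER:$ARS_PASS" \
      'https://localhost:9200/_cat/health?v' | health_status)
    case "$status" in
      yellow|green) echo "OpenSearch is ready ($status)"; return 0 ;;
    esac
    tries=$((tries - 1))
    sleep 10
  done
  echo "OpenSearch did not become ready in time" >&2
  return 1
}

# Example: ARS_USER=admin ARS_PASS=<password> wait_for_opensearch 60
```

While OpenSearch is still starting, curl returns no usable output, so the status stays empty and the loop simply retries.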

Restart Opensearch-Dashboards Service

  • Enter the command: systemctl restart opensearch-dashboards

Restart Logstash Service

  • Enter the command: ps -ax | grep logstash

  • Note the PID of the logstash process

  • Kill the process. Enter the command: kill -9 <PID>

  • Restart the service by running the script: /etc/logstash_start.sh
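The output of ps -ax | grep logstash usually includes the grep process itself, so its PID can be noted by mistake. As a hedged convenience, the sketch below prints only the PIDs of processes whose command line mentions logstash and does not mention grep. logstash_pids is a hypothetical helper name, and the first column is assumed to be the PID, as in standard ps -ax output.

```shell
# Print the PIDs of Logstash processes from `ps -ax` output on stdin,
# skipping helper processes such as grep whose command line also
# contains the word "logstash".
logstash_pids() {
  awk '/logstash/ && !/grep/ { print $1 }'
}

# Example: ps -ax | logstash_pids
```

The printed PID can then be used in the kill step above before running /etc/logstash_start.sh.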