
Diskover Technical Troubleshooting


© Diskover Data, Inc. All rights reserved. All information in this manual is subject to change without notice. No part of the document may be reproduced or transmitted in any form, or by any means, electronic or mechanical, including photocopying or recording, without the express written permission of Diskover Data, Inc.


Diskover Indexers


Diskover Indexer for Linux

Log Debug

🔴  Enable debug logging in config:

vi /root/.config/diskover/config.yaml

🔴  Set logLevel to DEBUG and enable logging to a file by setting logToFile to True:

logLevel: DEBUG
logToFile: True
logDirectory: /tmp/

🔴  Alternatively, you can run diskover manually and redirect all stdout/stderr output to a log file:

python3 diskover.py ... > /var/log/diskover.log 2>&1
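To confirm the config edits before re-running a scan, the three logging settings can be printed back from the file. This is a minimal sketch: `show_log_settings` is a hypothetical helper name, not part of Diskover, and it assumes each setting sits on its own line as shown above.

```shell
# Print the logging-related settings from a Diskover config file.
# show_log_settings is a hypothetical helper, not part of Diskover.
show_log_settings() {
  config_file="${1:-/root/.config/diskover/config.yaml}"
  # Match the three keys at any indentation; commented-out lines are skipped.
  grep -E '^[[:space:]]*(logLevel|logToFile|logDirectory):' "$config_file"
}
```

Example: `show_log_settings /root/.config/diskover/config.yaml`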

Log Warnings

urllib3.connectionpool - WARNING - Connection pool is full, discarding connection: localhost

If you see this Elasticsearch warning, there are two things you can try:

  • In your diskover config, lower maxthreads to something like 16 or 20.
  • Raise the maxsize setting for Elasticsearch connections to 40 or more.
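To judge whether a setting change helped, one option is to count how often the warning appears in a log captured before and after the change. The sketch below assumes the scan output was written to a file (for example via the redirect shown earlier); `count_pool_warnings` is a hypothetical helper name.

```shell
# Count "Connection pool is full" warnings in a Diskover log file.
# count_pool_warnings is a hypothetical helper name.
count_pool_warnings() {
  log_file="$1"
  # grep -c prints the number of matching lines. It exits 1 when the count
  # is 0, so treat that exit code as success and keep real errors (2).
  grep -c 'Connection pool is full, discarding connection' "$log_file" || [ $? -eq 1 ]
}
```

Example: `count_pool_warnings /var/log/diskover.log`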

Diskover Indexer for Windows

🔴  Enable debug logging in config by setting logLevel to DEBUG and enable logging to a file by setting logToFile to True:

logLevel: DEBUG
logToFile: True
logDirectory: C:\\Windows\\Temp

Diskover-Web


Diskover-Web for Linux

🔴  To view log files associated with Diskover-Web errors:

tail -f /var/log/nginx/error.log

Diskover-Web for Windows

🔴  To view log files associated with Diskover-Web errors:

C:\Program Files\Nginx\nginx-1.19.6\logs\error.log

Diskover-Web Task Management

🔴  To get started with the Task Panel, check that the json files are present in the diskover-web public/tasks/ directory:

cd /var/www/diskover-web/public/tasks
ls *.json
tasklog.json    tasks.json  templates.json  workers.json

You should see the above .json files which are used by the Task Panel for storing task and worker related data.

🔴  If you only see .json.sample files, copy the sample/default files:

for f in *.json.sample; do cp "$f" "${f%.*}"; done
chmod 660 *.json
chown nginx:nginx *.json
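The copy loop above overwrites any .json file that already exists. Where some task files are already in use, a more cautious variant might skip existing targets. This is a sketch only: `restore_sample_json` is a hypothetical helper name, and the chmod/chown steps above still need to be applied afterward.

```shell
# Copy each *.json.sample to its *.json name unless the target already exists.
# restore_sample_json is a hypothetical helper name.
restore_sample_json() {
  tasks_dir="${1:-/var/www/diskover-web/public/tasks}"
  for sample in "$tasks_dir"/*.json.sample; do
    [ -e "$sample" ] || continue          # glob matched no sample files
    target="${sample%.sample}"            # tasks.json.sample -> tasks.json
    if [ -e "$target" ]; then
      echo "keeping existing $target"
    else
      cp "$sample" "$target" && echo "created $target"
    fi
  done
}
```

Example: `restore_sample_json /var/www/diskover-web/public/tasks`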

Diskover-Web Tasks Not Running or Failing

  • You will need to start at least one Diskover worker daemon diskoverd to work on tasks.
  • diskoverd can run on the diskover host or on any host.
  • diskoverd requires access to the Diskover-Web REST API which is located at http://<diskover-web-host>:<port>/api.php

If tasks are not running when scheduled, or are showing a last status of failed, follow the steps below to troubleshoot.

🔴   Check that the Workers tab in the Task Panel shows at least one worker online.

🔴   For index tasks, check that the mount being scanned is still mounted on the indexing host, for example with the mount or df commands.

🔴   Check the diskoverd worker log files for any errors or warnings. You can find the log file location by checking the logDirectory setting in the diskoverd config. The diskoverd config file is at ~/.config/diskoverd/config.yaml.
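The mount check in the steps above can also be scripted, for instance before a scheduled scan. The sketch below reads the Linux mounts table directly; `is_mounted` is a hypothetical helper name, and the optional second argument exists only so a different mounts table can be supplied.

```shell
# Succeed if the given path is listed as a mount point.
# is_mounted is a hypothetical helper; the second argument (the mounts table
# to read) defaults to /proc/mounts on Linux.
is_mounted() {
  mount_path="$1"
  mounts_file="${2:-/proc/mounts}"
  # Field 2 of each /proc/mounts line is the mount point.
  awk -v p="$mount_path" '$2 == p { found = 1 } END { exit !found }' "$mounts_file"
}
```

Example: `is_mounted /mnt/data || echo "/mnt/data is not mounted"`, where /mnt/data stands in for the path being scanned. Mount points containing spaces are escaped in /proc/mounts and would need extra handling.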


Diskover-Web Tasks Running a Long Time

If tasks are running much longer than usual or expected, a task child process may have hit an error, failed to exit, and still be running. Follow the steps below to troubleshoot.

🔴   Check the diskoverd worker log files for any errors or warnings. You can find the log file location by checking the logDirectory setting in the diskoverd config. The diskoverd config file is at ~/.config/diskoverd/config.yaml.

🔴   For index tasks, check that the mount being scanned is still mounted on the indexing host, for example with the mount or df commands.

🔴   If an indexing task is taking a long time, first try to stop or force stop the task using the task drop down button. If the task has not stopped after a minute, check the indexing host for any diskover.py processes that have been running a long time and kill them manually:

ps -ef | grep diskover.py
kill <pid>

Note: Killing a diskover.py index scan process leaves a corrupt index, which should be deleted.

🔴   Afterward, check that the task shows its last status as Failed. If it still shows Running, you may need to reset the task status by clicking the task drop down button and clicking Reset Status.
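When several scans are running, it can help to list only the diskover.py processes that have been alive longer than some threshold before deciding what to kill. This is a sketch under assumptions: `long_running_scans` is a hypothetical helper name, the one-day threshold is an arbitrary example, and the `etimes` column requires the procps-ng version of ps.

```shell
# Read "PID ELAPSED_SECONDS COMMAND..." lines on stdin and print those whose
# command mentions diskover.py and whose elapsed time exceeds the threshold.
# long_running_scans is a hypothetical helper name.
long_running_scans() {
  min_seconds="${1:-86400}"   # example threshold: one day
  awk -v min="$min_seconds" '$2 + 0 > min + 0 && /diskover\.py/ { print }'
}

# Usage (procps-ng ps; "etimes" is elapsed time in seconds):
#   ps -eo pid=,etimes=,args= | long_running_scans 86400
```

The first column of each printed line is the pid to pass to kill.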


Unable to Access Diskover-Web from Browser

🔴  Ensure the web-server components are running:

systemctl status nginx
systemctl status php-fpm

🔴  Check the nginx web-server error logs:

tail -f /var/log/nginx/error.log

🔴  Trace access from a web session by reviewing the nginx access logs. Open a web browser and attempt to access diskover-web; the access attempt should appear in the access log:

tail -f /var/log/nginx/access.log
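The two service checks above can be folded into one pass that reports each service on a single line. A minimal sketch, assuming a systemd host and the service names nginx and php-fpm used above; `check_web_services` is a hypothetical helper name.

```shell
# Report the state of the web-server services Diskover-Web depends on.
# check_web_services is a hypothetical helper; it returns non-zero if any
# service is not active.
check_web_services() {
  rc=0
  for svc in nginx php-fpm; do
    if systemctl is-active --quiet "$svc"; then
      echo "$svc: active"
    else
      echo "$svc: NOT active"
      rc=1
    fi
  done
  return "$rc"
}
```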

Hard Reload After a Software Update

Sometimes, after the web user interface is updated, the browser needs a forced/hard reload to pick up the new versions of cached files such as JavaScript.

Click here for more information.

For example, an overlay display of the volumes and directories might appear after a software update, in which case a hard reload might be necessary.


Missing Indices

By default, Diskover-Web does not load all indices in Elasticsearch. This is for performance reasons, in case there are thousands of indices in Elasticsearch.

🔴  First check that there are not any missing indices in Elasticsearch. To see all diskover indices in Elasticsearch:

curl -X GET "http://<eshost>:9200/_cat/indices/diskover-*?v=true&s=index&pretty"

On AWS ES/OpenSearch:

curl -X GET -u user:pass "https://<aws es endpoint>/_cat/indices/diskover-*?v=true&s=index&pretty"

On the indices page, there is a max indices to load input setting which controls the number of indices to load. Indices are loaded by order of creation date. If you are missing indices in the list, try increasing this number. This is a per user setting that gets stored in a cookie in each user's browser.

This number can also be set for all users in web config's MAX_INDEX setting. If the user's browser maxindex cookie is lower than this number, their cookie will be set to this number.
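To decide whether the max indices to load value is too low, the number of diskover indices Elasticsearch holds can be compared against it. The sketch below uses the `h=index` option of the _cat API to return only index names; `count_diskover_indices` is a hypothetical helper name.

```shell
# Count the diskover-* indices Elasticsearch reports.
# count_diskover_indices is a hypothetical helper name.
count_diskover_indices() {
  es_url="$1"    # e.g. http://<eshost>:9200
  # h=index limits the _cat output to the index-name column, one per line.
  # grep -c exits 1 when the count is 0, so accept that exit code.
  curl -s -X GET "$es_url/_cat/indices/diskover-*?h=index" | grep -c '^diskover-' || [ $? -eq 1 ]
}
```

If the count printed is higher than the max indices to load value, some indices will not appear in the list.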


Nginx Reverse Proxy / Nginx Too Big Header Error / Bad Gateway 502 Error

If you are running an nginx reverse proxy and are seeing bad gateway 502 errors, along with "upstream sent too big header while reading response header from upstream" in your nginx error log, you will need to increase your nginx buffer sizes:

🔴  On the nginx reverse proxy host, in the nginx config file:

http {
  proxy_buffer_size   128k;
  proxy_buffers   4 256k;
  proxy_busy_buffers_size   256k;
}

🔴  After making changes, restart or reload the nginx service.
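A syntax error in the edited config would stop nginx from coming back up, so it is worth testing the config before reloading. A minimal sketch using `nginx -t`, assuming a systemd host; `reload_nginx_if_valid` is a hypothetical helper name.

```shell
# Reload nginx only if the edited configuration passes its syntax check.
# reload_nginx_if_valid is a hypothetical helper name.
reload_nginx_if_valid() {
  if nginx -t; then
    systemctl reload nginx
  else
    echo "nginx config test failed; not reloading" >&2
    return 1
  fi
}
```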

More info in the nginx documentation


Error 500 Upon Login

🔴  Double-check your permissions are set properly for RHEL / CentOS:

chmod 660 /var/www/diskover-web/public/*.txt

chmod 660 /var/www/diskover-web/public/tasks/*.json

chown -R root:nginx /var/lib/php

chown -R nginx:nginx /var/lib/php/session
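To see the current state before or after changing anything, the mode and ownership of the affected paths can be listed in one call. This sketch assumes GNU stat, which is standard on RHEL / CentOS; `show_perms` is a hypothetical helper name.

```shell
# Print mode, owner:group, and name for each path given, so they can be
# compared against the values set above.
# show_perms is a hypothetical helper and assumes GNU stat.
show_perms() {
  stat -c '%a %U:%G %n' "$@"
}
```

Example: `show_perms /var/www/diskover-web/public/tasks/*.json /var/lib/php /var/lib/php/session`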

Diskoverd Task Workers


  • You will need to start at least one Diskover worker daemon diskoverd to work on tasks.
  • diskoverd can run on the diskover host or on any host.
  • diskoverd requires access to the Diskover-Web REST API which is located at http://<diskover-web-host>:<port>/api.php

Verbose Output

🔴  Enable verbose output:

python3 diskoverd.py -v -n <worker name>

Note: -n is optional. Use -h to list all CLI options.

Log Debug

🔴  Enable debug logging in config:

vi /root/.config/diskoverd/config.yaml

🔴  Set logLevel to DEBUG and enable logging to a file by setting logToFile to True:

logLevel: DEBUG
logToFile: True
logDirectory: /tmp/

🔴  Alternatively you can run diskoverd manually and redirect all stdout/stderr output to a log file:

python3 diskoverd.py ... > /var/log/diskoverd.log 2>&1

Service Control

🔴   Stop diskoverd:

systemctl stop diskoverd

🔴   Start diskoverd:

systemctl start diskoverd

🔴   Restart diskoverd:

systemctl restart diskoverd

🔴   Check diskoverd status:

systemctl status diskoverd

Stopping/Restarting Diskoverd

🔴   Before stopping or restarting diskoverd, check that no child processes, such as diskover.py indexing tasks, are running:

ps -ef | grep diskover

🔴   Kill any child process:

kill <pid>

Note: after killing a diskover.py indexing process, the index will be in a corrupt state and should be deleted.
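The check-then-stop sequence above can be combined so that diskoverd is stopped only when no scan is in progress. This is a sketch, not a Diskover-provided command: `safe_stop_diskoverd` is a hypothetical helper name, and it assumes pgrep from procps-ng and a systemd-managed diskoverd service.

```shell
# Stop diskoverd only when no diskover.py process is running.
# safe_stop_diskoverd is a hypothetical helper name.
safe_stop_diskoverd() {
  # pgrep -f matches against the full command line; -a also prints it.
  if pgrep -af 'diskover\.py'; then
    echo "diskover.py still running; not stopping diskoverd" >&2
    return 1
  fi
  systemctl stop diskoverd
}
```

When the function refuses to stop the service, it prints the pid and command line of each running diskover.py process.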


Elasticsearch


AWS Elasticsearch Domain

The following reference page describes how to identify and solve common Amazon Elasticsearch Service (Amazon ES) issues. Consult the information in this section before contacting AWS Support.

https://docs.aws.amazon.com/opensearch-service/latest/developerguide/handling-errors.html


Useful Commands

See all diskover indices in Elasticsearch:

curl -X GET "http://<eshost>:9200/_cat/indices/diskover-*?v=true&s=index&pretty"

On AWS ES/OpenSearch:

curl -X GET -u user:password "https://<aws es endpoint>/_cat/indices/diskover-*?v=true&s=index&pretty"

Check Cluster Health

curl -X GET "http://elasticsearch:9200/_cat/health?v"
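For scripts or monitoring, only the status colour is usually needed. The sketch below asks the _cat/health API for just the status column via `h=status`; `cluster_status` is a hypothetical helper name and the default URL mirrors the example host above.

```shell
# Print only the cluster status colour (green, yellow, or red).
# cluster_status is a hypothetical helper name.
cluster_status() {
  es_url="${1:-http://elasticsearch:9200}"
  # h=status restricts the _cat/health output to the status column.
  curl -s -X GET "$es_url/_cat/health?h=status" | tr -d '[:space:]'
}
```

Example: `[ "$(cluster_status)" = "green" ] || echo "cluster needs attention"`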

Delete Indices

curl -X DELETE http://elasticsearch:9200/diskover-indexname

Note: Wildcards can be used to delete multiple indices.

Delete multiple indices with a comma-separated list:

curl -X DELETE http://elasticsearch:9200/diskover-index1,diskover-index2,diskover-index3,diskover-index4
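Because deletes cannot be undone, it can be safer to build the command from a list of index names and review it before running it. The following sketch only prints the command; `build_delete_cmd` is a hypothetical helper name.

```shell
# Print (without running) the curl command that deletes the named indices.
# build_delete_cmd is a hypothetical helper name.
build_delete_cmd() {
  es_url="$1"; shift
  # Join the remaining arguments with commas.
  indices=$(IFS=,; printf '%s' "$*")
  printf 'curl -X DELETE "%s/%s"\n' "$es_url" "$indices"
}
```

Example: `build_delete_cmd http://elasticsearch:9200 diskover-index1 diskover-index2` prints the delete command for review.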

If using wildcards causes an Elasticsearch API error, see:

https://www.elastic.co/guide/en/elasticsearch/reference/7.17/indices-delete-index.html#delete-index-api-path-params
https://www.elastic.co/guide/en/elasticsearch/reference/7.17/index-management-settings.html#action-destructive-requires-name

To Query the Elasticsearch Cluster with Login Credentials

curl -X GET -u login:password https://elasticsearch:9200/

More info for the above commands can be found here:

https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-indices.html


Possible issues with Log4j and Elasticsearch

12/20/2021 - In the past week or so, a security vulnerability in Log4j, a Java logging library used by Elasticsearch, was disclosed. Although Diskover software does not use Java, Elasticsearch does. This issue has been patched in Elasticsearch version 7.16.1. Diskover has tested Elasticsearch version 7.16.1 and found no compatibility issues.

https://www.elastic.co/guide/en/elasticsearch/reference/current/release-notes-7.16.1.html

Regarding the AWS OpenSearch implementation of Elasticsearch: AWS is aware of the recently disclosed security issue affecting the open source Apache “Log4j2” utility. This utility is used by Amazon OpenSearch Service. AWS has released a service software update, R20211203-P2, that contains the updated “Log4j2” utility that addresses the issue. AWS strongly recommends that you apply this software update immediately to mitigate this issue for your OpenSearch domains.

Click here to be redirected to the AWS website for more information.


Miscellaneous


Config Error

If you get any config error messages when starting diskover.py:

  • Make sure that you used spaces and not tabs.
  • Compare your config against the default/sample config in the configs_sample folder to check for errors or missing settings.
  • Check that there are no errors in your config like missing values, missing commas, missing close brackets, etc.
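Tabs are hard to spot by eye, so the first check in the list above can be done with grep. A minimal sketch; `find_tabs` is a hypothetical helper name.

```shell
# List the lines in a config file that contain a tab character, with their
# line numbers. Exits 0 when tabs are found and 1 when there are none.
# find_tabs is a hypothetical helper name.
find_tabs() {
  tab=$(printf '\t')
  grep -n "$tab" "$1"
}
```

Example: `find_tabs /root/.config/diskover/config.yaml`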

Code Snippets Don't Work

If you are typing code by hand instead of using copy/paste, make sure you use spaces in config files, not tabs, and that you keep the exact number of spaces.


Contact Diskover


Methods            Coordinates
Phone              800-560-5853
Sales              sales@diskoverdata.com
Support            support@diskoverdata.com
General Inquiries  info@diskoverdata.com
Website            https://diskoverdata.com
Slack              Join the Diskover Slack Workspace
GitHub             Visit us on GitHub
