Backing up the Digital Hive Content Store

Backing up the Digital Hive Content Store

Question

Is there a way to backup the Digital Hive content store?

Answer

Digital Hive uses PostgreSQL as the application content store. As part of the Digital Hive platform, a daily process executes that creates an archive of the entire content store, which is then saved to the <install_directory>\PostgreSQL\postgresql_backups folder. 

By design, these backups are NOT deleted, and will eventually consume a lot of disk space. There are a couple of techniques that can be used to not only reduce the amount backups that are maintained, but can also reduce the size of the actual archive files.

To automate the deletion of unwanted PostgreSQL archive files, a batch (.bat) file can be created that can either be run manually, or through a scheduling application. To create the batch file:
  1. Open your favorite text editor
  2. Create a new text file
  3. Paste the following lines of text into the file 

    Echo delete backups older than 3 days
    forfiles -p C:\DigitalHive\PostgreSQL\postgresql_backups\ -s -m *.zip /D -3 /C "cmd /c del /q @path"
    echo Done!

  4. Change the installation path from C:\DigitalHive to what ever installation path was created for your Digital Hive installation
  5. By default, this script will only keep the last three instances of the backup. Change the 3 in the command to reflect the amount of backups that you would like to persist
  6. Save the file as delete_postgres_backups.bat
  7. Close the text editor
To run this batch file manually, navigate to the saved delete_postgres_backups.bat file and double click the file. If desired, the batch file execution can be automated by scheduling the batch file through the Windows scheduler or another server scheduling application.

If this file will be manually executed, save the .bat file to the desktop so that anyone logging into the Digital Hive server has easy access.

Over time the daily backups will increase in size. This is due to many factors such as the number of Hives increases, more users, more content, etc, which can not be controlled. There is however, an opportunity to reduce the size of the archives by eliminating some duplication and stale entries. To perform this task, you must be using Digital Hive version 2.0 FL3 or later. 
  1. Open a Command Prompt and navigate to the DigitalHive\PostgreSQL\bin directory
  2. Run this command (assuming that the defaults were used for the admin db account and db name, otherwise use what it was changed to):
    vaccumlo -v -U postgres digitalhive

  3.  Using pgAdmin, connect to the Digital Hive database and run these create trigger/procedure commands:

CREATE TRIGGER t_thumbnail_prune BEFORE UPDATE OR DELETE ON adf_content_thumbnail
FOR EACH ROW
WHEN( coalesce(current_setting('your.SKIP_PRUNE_TRIGGER', 't'),'FALSE') <> 'TRUE')
EXECUTE PROCEDURE lo_manage(thumbnail);
CREATE TRIGGER t_pr_emailpass_template BEFORE UPDATE OR DELETE ON ADF_EMAIL_CONFIG
FOR EACH ROW
WHEN( coalesce(current_setting('your.SKIP_PRUNE_TRIGGER', 't'),'FALSE') <> 'TRUE')
EXECUTE PROCEDURE lo_manage(EMAIL_PASSWORD_TEMPLATE);
CREATE TRIGGER t_pr_file_content BEFORE UPDATE OR DELETE ON THEIA_FILES
FOR EACH ROW
WHEN( coalesce(current_setting('your.SKIP_PRUNE_TRIGGER', 't'),'FALSE') <> 'TRUE')
EXECUTE PROCEDURE lo_manage(CONTENT);
CREATE TRIGGER t_pr_file_thumb_l BEFORE UPDATE OR DELETE ON THEIA_FILES
FOR EACH ROW
WHEN( coalesce(current_setting('your.SKIP_PRUNE_TRIGGER', 't'),'FALSE') <> 'TRUE')
EXECUTE PROCEDURE lo_manage(THUMBNAIL_LARGE);
CREATE TRIGGER t_pr_file_thumb_s BEFORE UPDATE OR DELETE ON THEIA_FILES
FOR EACH ROW
WHEN( coalesce(current_setting('your.SKIP_PRUNE_TRIGGER', 't'),'FALSE') <> 'TRUE')
EXECUTE PROCEDURE lo_manage(THUMBNAIL_SMALL);
CREATE TRIGGER t_pr_adffile_content BEFORE UPDATE OR DELETE ON ADF_CONTENT_FILE
FOR EACH ROW
WHEN( coalesce(current_setting('your.SKIP_PRUNE_TRIGGER', 't'),'FALSE') <> 'TRUE')
EXECUTE PROCEDURE lo_manage(CONTENT);
CREATE TRIGGER t_pr_adffile_thumb_l BEFORE UPDATE OR DELETE ON ADF_CONTENT_FILE
FOR EACH ROW
WHEN( coalesce(current_setting('your.SKIP_PRUNE_TRIGGER', 't'),'FALSE') <> 'TRUE')
EXECUTE PROCEDURE lo_manage(THUMBNAIL_LARGE);
CREATE TRIGGER t_pr_adffile_thumb_s BEFORE UPDATE OR DELETE ON ADF_CONTENT_FILE
FOR EACH ROW
WHEN( coalesce(current_setting('your.SKIP_PRUNE_TRIGGER', 't'),'FALSE') <> 'TRUE')
EXECUTE PROCEDURE lo_manage(THUMBNAIL_SMALL);

Running these PostgreSQL commands will reduce the size of the individual archive files. By employing both techniques of reducing the number of persisted archives, and reducing the actual file sizes, a lot of disk space can remain available to the server.
    • Related Articles

    • Adding Custom Metadata Fields to Enhance Digital Hive Content

      Question Looking at the properties for a report, there is metadata visible from the content systems like Tableau, PowerBI, and IBM Cognos Analytics, but is it possible to create custom metadata values on a report or dashboard. The objective is to ...
    • Digital Hive Architecture Overview

      Overview Digital Hive is a lightweight, web-based application that is both easy to install as well as to maintain and manage. Other than the Microsoft C++ redistributable (https://www.microsoft.com/en-ca/download/details.aspx?id=40784), there are no ...
    • Digital Hive Load Balancing

      Question Due to the number of users and the critical nature of the BI reports, it is desirable to introduce more capacity into the Digital Hive environment. Can Digital Hive be installed across multiple servers? Answer Digital Hive can be installed ...
    • Digital Hive Auditing Explained

      Question The Digital Hive Control Center doesn't seem to provide any reports or views around usage of the platform. Is there a way to create reports to show metrics like the amount of reports executed by platform, the number of searches being made, ...
    • Unable to Explore Digital Hive Content

      Question When using the Explore Content interface to browse the Digital Hive content, a message stating "Something went wrong" and a Try Again button is displayed How do I resolve this? Answer This issue is usually a result of Elasticsearch being ...