InfluxDB Consumed Available Drive Space

A friend of mine and I run some Grafana instances to monitor various things via Telegraf. John was having storage space issues on his machine and noticed that in our default installations we had never changed the data retention policy from its default, which keeps data forever!

Below is John Mora’s write-up of how he addressed the issue. Thank you, sir! I have used the process as documented and, unsurprisingly, it fixed my problem as well.

Problem

InfluxDB used up all of the allocated drive space.

Current Installation

Grafana, unifipoller, and InfluxDB running in Docker containers on Hyper-V.

Solution

View the retention policies and stored shards within InfluxDB. Remove unwanted shards to reclaim drive space, then update the retention policy so that shards are removed automatically after a set period. (Instead of creating a new retention policy, I simply altered the default auto-generated policy, autogen, that InfluxDB had created automatically.)

Questions

  1. What are InfluxDB Retention Policies?

    From: https://oznetnerd.com/2017/06/11/influxdb-retention-policies-shard-groups/

    The part of InfluxDB’s data structure that describes for how long InfluxDB keeps data (duration), how many copies of those data are stored in the cluster (replication factor), and the time range covered by shard groups (shard group duration). Retention Policies (RPs) are unique per database and along with the measurement and tag set define a series.

    When you create a database, InfluxDB automatically creates a Retention Policy called autogen with an infinite duration, a replication factor set to one, and a shard group duration set to seven days. See Database Management for retention policy management.

    Summary

    RPs define for how long data is kept. The default autogen RP is set to infinite while the default shard group duration (which is part of the autogen RP), is set to seven days.

    It was at this point I found myself getting confused. What is the difference between a RP duration and a Shard Group duration? And how can you have an expiry date on data which is configured to be kept infinitely? More on this later.

  2. What is an influxdb Shard?

    From: https://oznetnerd.com/2017/06/11/influxdb-retention-policies-shard-groups/

    Shard

    A shard contains the actual encoded and compressed data, and is represented by a TSM file on disk. Every shard belongs to one and only one shard group. Multiple shards may exist in a single shard group. Each shard contains a specific set of series. All points falling on a given series in a given shard group will be stored in the same shard (TSM file) on disk.

    Shard Groups

    Shard groups are logical containers for shards. Shard groups are organized by time and retention policy. Every retention policy that contains data has at least one associated shard group. A given shard group contains all shards with data for the interval covered by the shard group. The interval spanned by each shard group is the shard duration.

    Shard Duration

    The shard duration determines how much time each shard group spans. The specific interval is determined by the SHARD DURATION of the retention policy. See Retention Policy management for more information.

    For example, given a retention policy with SHARD DURATION set to 1w, each shard group will span a single week and contain all points with timestamps in that week.

  3. What within influxdb is using up all my drive space?

    Answer: The Shards (see above).
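
The relationship between the two durations can be sketched in a few lines of Python. This is an illustration, not InfluxDB's actual code. It assumes that shard group boundaries are found by truncating a timestamp to a multiple of the shard group duration (counted from 0001-01-01 UTC, which gives the Monday-aligned weekly groups seen in the listings in this post), and that a group's expiry_time is its end_time plus the retention policy duration. The function names are made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: boundaries are multiples of the shard duration counted from
# 0001-01-01T00:00:00Z (a Monday), so weekly groups start on Mondays.
ZERO_TIME = datetime(1, 1, 1, tzinfo=timezone.utc)


def shard_group_window(ts, shard_duration):
    """Return (start, end) of the shard group that would hold timestamp ts."""
    elapsed = ts - ZERO_TIME
    start = ts - (elapsed % shard_duration)
    return start, start + shard_duration


def expiry_time(end, rp_duration):
    """Return when a shard group becomes eligible for removal.

    A retention policy duration of 0 means "keep forever", so there is
    no expiry and None is returned.
    """
    if rp_duration == timedelta(0):
        return None
    return end + rp_duration


if __name__ == "__main__":
    point = datetime(2021, 4, 22, 13, 30, tzinfo=timezone.utc)
    start, end = shard_group_window(point, timedelta(weeks=1))
    print("group:", start.isoformat(), "to", end.isoformat())
    print("expires (2w policy):", expiry_time(end, timedelta(weeks=2)))
    print("expires (infinite policy):", expiry_time(end, timedelta(0)))
```

With a one-week shard duration and a two-week retention policy, a point written on 2021-04-22 lands in the group spanning 2021-04-19 to 2021-04-26, which expires on 2021-05-10. With the default infinite duration (0s) the group never expires, which is how the shards end up filling the drive.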

Solution Steps

  1. Open Console session

  2. Connect to Docker Influxdb Container Console Session:

    docker exec -it influxdb bash
    
  3. Connect to influxdb

    # influx
    Connected to http://localhost:8086 version 1.8.3
    InfluxDB shell version: 1.9.3
    
  4. Query influxdb for Installed Databases

    > show databases;
    name: databases
    ----
    _internal
    unifipoller
    
  5. Set Database to Query

    > use unifipoller;
    Using database unifipoller
    
  6. Show Database Retention Policy(ies)

    > show retention policies;
    name    duration shardGroupDuration replicaN default
    ----    -------- ------------------ -------- -------
    autogen 0s       168h0m0s           1        true
    
  7. Show Database Shards

    > show shard groups
    
    name: shard groups
    id database    retention_policy start_time           end_time             expiry_time
    -- --------    ---------------- ----------           --------             -----------
    18 _internal   monitor          2021-04-23T00:00:00Z 2021-04-24T00:00:00Z 2021-05-01T00:00:00Z
    16 unifipoller autogen          2021-03-19T00:00:00Z 2021-04-19T00:00:00Z 2021-05-10T00:00:00Z
    19 unifipoller autogen          2021-04-19T00:00:00Z 2021-04-26T00:00:00Z 2021-05-10T00:00:00Z
    12 telegraf    autogen          2021-03-19T00:00:00Z 2021-04-19T00:00:00Z 2021-05-10T00:00:00Z
    17 telegraf    autogen          2021-04-19T00:00:00Z 2021-04-26T00:00:00Z 2021-05-10T00:00:00Z
    
  8. Update Database Retention Policy

    > ALTER RETENTION POLICY autogen ON unifipoller DURATION 2w REPLICATION 1 SHARD DURATION 1w
    > INSERT INTO autogen measure1 value=0
    
  9. Verify Updated Database Retention Policy

    > show retention policies;
    name    duration shardGroupDuration replicaN default
    ----    -------- ------------------ -------- -------
    autogen 336h0m0s 168h0m0s           1        true
    
  10. Delete Database Shards (reclaim Drive Space)

    • You can tell which shards are no longer in use by their end_time: if it is in the past and you don’t want the data, you can remove the shard. The statement takes a shard id (show shards lists them), so confirm the id before dropping. Repeat for each shard you want to remove.
    > drop shard 12
    

  11. Verify Database Shards have been Deleted

    After deleting all of the unwanted shards, the remaining shards should look like this (for this example):

    > show shard groups
    
    name: shard groups
    id database    retention_policy start_time           end_time             expiry_time
    -- --------    ---------------- ----------           --------             -----------
    18 _internal   monitor          2021-04-23T00:00:00Z 2021-04-24T00:00:00Z 2021-05-01T00:00:00Z
    19 unifipoller autogen          2021-04-19T00:00:00Z 2021-04-26T00:00:00Z 2021-05-10T00:00:00Z
    17 telegraf    autogen          2021-04-19T00:00:00Z 2021-04-26T00:00:00Z 2021-05-10T00:00:00Z
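
Picking the shards to drop by eye gets tedious with more than a handful of rows. Here is a minimal Python sketch of the same selection rule: the end_time is in the past and the database is one you want to trim. It parses text pasted from show shard groups. The sample rows are the ones from step 7, while the "now" timestamp and the function names are assumptions made for the example. Like the steps above, it treats the listed id as the shard id, which is worth confirming with show shards before running any generated statement.

```python
from datetime import datetime, timezone

# Rows copied from the "show shard groups" output in step 7.
SAMPLE = """\
id database    retention_policy start_time           end_time             expiry_time
-- --------    ---------------- ----------           --------             -----------
18 _internal   monitor          2021-04-23T00:00:00Z 2021-04-24T00:00:00Z 2021-05-01T00:00:00Z
16 unifipoller autogen          2021-03-19T00:00:00Z 2021-04-19T00:00:00Z 2021-05-10T00:00:00Z
19 unifipoller autogen          2021-04-19T00:00:00Z 2021-04-26T00:00:00Z 2021-05-10T00:00:00Z
12 telegraf    autogen          2021-03-19T00:00:00Z 2021-04-19T00:00:00Z 2021-05-10T00:00:00Z
17 telegraf    autogen          2021-04-19T00:00:00Z 2021-04-26T00:00:00Z 2021-05-10T00:00:00Z
"""


def _parse_time(text):
    """Parse an RFC 3339 UTC timestamp such as 2021-04-19T00:00:00Z."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def parse_shard_groups(text):
    """Turn the pasted table into a list of dicts, skipping header lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 6 or not parts[0].isdigit():
            continue  # header, separator, or blank line
        rows.append({
            "id": int(parts[0]),
            "database": parts[1],
            "end_time": _parse_time(parts[4]),
        })
    return rows


def drop_statements(rows, now, databases):
    """Build DROP SHARD statements for groups that ended before `now`."""
    return [
        f"DROP SHARD {row['id']}"
        for row in rows
        if row["database"] in databases and row["end_time"] <= now
    ]


if __name__ == "__main__":
    # Hypothetical "now", chosen to fall inside the newest groups above.
    now = datetime(2021, 4, 23, 12, 0, tzinfo=timezone.utc)
    for stmt in drop_statements(parse_shard_groups(SAMPLE), now, {"unifipoller", "telegraf"}):
        print(stmt)
```

The sketch only prints statements and never talks to InfluxDB, so nothing is deleted until you paste a statement into the influx shell yourself.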
    

Summary

After performing the above steps, I reclaimed over 10GB of drive space. With the autogen retention policy altered, InfluxDB will now delete shards on its own once they age past the two-week retention period, which keeps drive usage in check.
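
To put rough numbers on the new policy, the short Python sketch below converts the duration literals used in the ALTER RETENTION POLICY statement into the hour-based form that show retention policies prints, and estimates how old the oldest data on disk can get. It assumes a shard group is removed as a whole once its end_time plus the policy duration has passed, and it only understands single-unit literals such as 2w or 168h. The function names are invented for the example.

```python
import re
from datetime import timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def parse_duration(literal):
    """Parse a single-unit duration literal such as '2w' or '168h'."""
    match = re.fullmatch(r"(\d+)([smhdw])", literal)
    if match is None:
        raise ValueError(f"unsupported duration literal: {literal!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


def hms(duration):
    """Format whole seconds the way the listings show them, e.g. 336h0m0s."""
    total = int(duration.total_seconds())
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    if hours:
        return f"{hours}h{minutes}m{seconds}s"
    if minutes:
        return f"{minutes}m{seconds}s"
    return f"{seconds}s"


def max_age_on_disk(rp_duration, shard_duration):
    """Worst-case age of the oldest point just before its group is dropped.

    The group is kept until end_time + rp_duration, and its oldest point
    sits at start_time, one shard duration before end_time.
    """
    return parse_duration(rp_duration) + parse_duration(shard_duration)


if __name__ == "__main__":
    print(hms(parse_duration("2w")))          # policy duration
    print(hms(parse_duration("1w")))          # shard group duration
    print(hms(max_age_on_disk("2w", "1w")))   # oldest data that may remain
```

Under these assumptions, a two-week policy with one-week shard groups can hold up to roughly three weeks of data at any moment, because data is removed one whole shard group at a time rather than point by point.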