
Graylog stuck at "Graylog is restarting..." with disk full

Graylog is restarting...
There is no Graylog web application running at the moment, please reload this page in a minute. It can take up to 1-2 minutes until all services are running properly. In case this is a permanent error, check the following:

Check if all services are running - sudo graylog-ctl status shows an overview of all running services
Check for errors in log files - Relevant services write log files here: /var/log/graylog/*/current
Ask for help - If there is no way to fix the issue ask for help:


I got this error on my Graylog server. While troubleshooting, I found that the disk was 100% full and that Elasticsearch, MongoDB, and etcd were unable to start when I checked the server status with:

# graylog-ctl status

The solution to this problem was obvious: I had to clean up some disk space to get Graylog working again. But which files could I safely delete? That was my next thought!
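Before deleting anything, it is worth confirming what is actually consuming the disk. A minimal sketch, assuming the appliance-style layout under /var/opt/graylog that this install uses (adjust the path for your own setup):

root@graylog:~# df -h
root@graylog:~# du -h --max-depth=2 /var/opt/graylog/data | sort -hr | head

Sorted largest-first, the du output makes the heaviest directories easy to spot.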

Upon googling, I found that I could safely delete old Elasticsearch index folders, which hold the stored log data, to free up space.

So I stopped the Graylog services with:

$ sudo graylog-ctl stop

In my installation, the Elasticsearch index data was located at:

root@graylog:/var/opt/graylog/data/elasticsearch/graylog/nodes/0/indices#

I listed the directories at this path:

root@graylog:/var/opt/graylog/data/elasticsearch/graylog/nodes/0/indices# ls -al

drwx------ 7 graylog graylog 4096 Aug 12  2016 graylog_0
drwx------ 7 graylog graylog 4096 Aug  3  2017 graylog_1
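Before picking one to delete, a quick size check of each index folder helps; a sketch, run from the same indices directory:

root@graylog:/var/opt/graylog/data/elasticsearch/graylog/nodes/0/indices# du -sh *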

I deleted the older index folder "graylog_0", which had consumed around 5 GB of disk space.

root@graylog:/var/opt/graylog/data/elasticsearch/graylog/nodes/0/indices# rm -R graylog_0/

After deleting the index folder, I restarted the Graylog services:

root@graylog:~# graylog-ctl start

Now I can access the Graylog server again, and all my configurations and dashboards are in place and working. But etcd (used for clustering of nodes) reports database corruption: one of its write-ahead log (.wal) files is not accessible.
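To confirm the etcd failure, its service log can be inspected; the path follows the /var/log/graylog/*/current pattern mentioned on the restart page above:

root@graylog:~# tail -n 50 /var/log/graylog/etcd/current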

Since this is my only node and not a cluster configuration, I deleted the etcd data folder and reconfigured the Graylog server.

Delete the etcd folder and run a reconfigure from the data directory:

root@graylog:/var/opt/graylog/data# rm -R etcd

root@graylog:/var/opt/graylog/data# graylog-ctl reconfigure
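A slightly safer variant (an assumption on my part, not what I did here) would be to move the etcd folder aside instead of deleting it, so it can be restored if the reconfigure fails:

root@graylog:/var/opt/graylog/data# mv etcd etcd.bak
root@graylog:/var/opt/graylog/data# graylog-ctl reconfigure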

Now I can see all services running in the Graylog status output:

root@graylog:/var/opt/graylog/data/etcd/member# graylog-ctl status
run: elasticsearch: (pid 4437) 21s; run: log: (pid 876) 1059s
run: etcd: (pid 4272) 25s; run: log: (pid 891) 1059s
run: graylog-server: (pid 4490) 20s; run: log: (pid 857) 1059s
run: mongodb: (pid 4314) 23s; run: log: (pid 890) 1059s
run: nginx: (pid 4515) 20s; run: log: (pid 856) 1059s
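As a final sanity check (assuming the data lives under /var/opt/graylog as above), disk usage should now be back well below 100%:

root@graylog:~# df -h /var/opt/graylog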




