Not seeing data from new log source


If you are not seeing data in Gravwell from a new source that you expect to see, there are several steps you can take to troubleshoot and resolve the issue.


  1. Verify that the ingester appears in the Web UI's Ingester Health and Status page.
  2. Verify that it is shown as currently connected.
  3. Check that the tag you are attempting to ingest is listed.
  4. Try searching for the tag with the timeframe set to "Preview".




Validating that the Ingester is Connected


When bringing on a new ingester with your data, it's not uncommon to have issues with the ingester not properly connecting to the indexer. In these cases you may not see the new ingester listed on the Health and Status page at all, because it has never signed in and registered with the indexer. For pre-existing ingesters, you may instead see the ingester in a disconnected state if, for example, you added a new listener that prevented the process from restarting successfully.


  • Verify that the ingester process is in a running state:
    example:
    systemctl status gravwell_simple_relay


    If the ingester is showing in a failed state, you can check the crash logs in /opt/gravwell/log/crash, which will often tell you what caused the process to crash on startup. Common causes include syntax errors in the config file, such as unclosed brackets or quotes, and problems with a listener binding to its port.


    If the ingester is showing in a running state, the ingester logs in /opt/gravwell/log/ can help identify possible issues with the connection to the indexer. Common causes include DNS lookup failures for the indexer hostname, network connectivity issues reaching the indexer (such as firewall port blocks), or an incorrect Ingest Secret that prevents successful authentication with the indexer.
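
For example, on a Simple Relay installation you might check the service state, the crash logs, the ingester log, and the network path to the indexer with commands along these lines (the log file name, indexer hostname, and port below are illustrative placeholders; Gravwell's default ingest ports are typically 4023 for cleartext and 4024 for TLS):

    systemctl status gravwell_simple_relay         # is the service running or failed?
    ls -lt /opt/gravwell/log/crash/                # newest crash log first, if any
    tail -n 50 /opt/gravwell/log/simple_relay.log  # recent ingester log messages (file name varies by ingester)
    nc -zv my-indexer.example.com 4023             # can this host reach the indexer's ingest port?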



Validating that the Tag is seen



When an ingester connects to the indexer, it will generally inform the indexer of all the tags it is configured to ingest. The exceptions to this may be tags that are set via preprocessors, and Federators, which pass through tags from upstream ingesters.



If you do not see the tag listed in the ingester configuration, or you get an "unknown tag" error when running a tag=myTag search, it is worth checking the configuration on the ingester to ensure the tag is configured correctly (with no typos), and that the ingester process itself has been restarted since the configuration was updated so that it has read the new listener configuration.
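
As an illustrative sketch, a Simple Relay listener that assigns a myTag tag might look something like the following (the listener name, bind port, and tag are placeholders; check the documentation for your specific ingester for the exact options it supports). Remember to restart the ingester afterwards so it re-reads the configuration:

    [Listener "my-new-source"]
        Bind-String="0.0.0.0:7777"
        Tag-Name=myTag

    systemctl restart gravwell_simple_relay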


In the case of tags set via a preprocessor, validate the preprocessor configuration to make sure it is set up correctly. You may also want to try searching for the listener's default tag, or other possible tags set by preprocessors on the listener, in case your data is being ingested but simply not routing as expected. (i.e. if your data matches multiple routing conditions set in preprocessors on the listener, it may be getting tagged based on a different matching condition than you expect.)
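
For example, if the listener's default tag is syslog and its preprocessors can route entries to syslog-auth or syslog-fw (example tag names), a quick check across all three might be:

    tag=syslog,syslog-auth,syslog-fw | table TAG DATA

Running this over the Preview timeframe (described below) will also catch entries that landed under an unexpected timestamp.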



Searching for the data using the Preview Timeframe



When ingesting data, Gravwell will attempt to determine the correct time to attach to each entry to the best of its ability. Often this is accomplished by looking for a timestamp in the message payload itself and using that to determine the timestamp to apply within the time series database. By default, Gravwell's Query Studio will search for data over the past hour. This means that if the data is being ingested, but is being assigned a timestamp outside of that past hour, you may not see what you expect when validating that your new data source is being ingested.

The "Preview" timeframe within Query studio will essentially ignore the standard Timestamp limitation on the search and will give you a random sampling of the data that matches the query.    This can help you identify if the data is possibly being ingested,  but with an incorrect or unexpected timestamp.      (Alternatively,  you can extend the search window to a longer timeframe instead of using the Preview function).


Common causes of data appearing only in extended search timeframes include:

  • The data source sending historical data rather than "current time frame" data as expected
  • A timezone mismatch (i.e. the timezone is not properly defined in the source data, so the ingester makes an incorrect offset assumption when converting to UTC for ingest; see the configuration sketch below)
  • The time/date on the source device being incorrect
  • Multiple timestamps within the message, with the wrong one being used by the ingester when applying the TIMESTAMP value
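
Where a timezone mismatch is the cause, many ingesters let you tell the listener how to interpret timestamps that lack an offset. As an illustrative sketch for a Simple Relay listener (option names and availability may vary by ingester and version, so verify against your ingester's documentation):

    [Listener "my-new-source"]
        Bind-String="0.0.0.0:7777"
        Tag-Name=myTag
        Timezone-Override="America/Chicago"   # treat timestamps without an offset as US Central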



Tricks for helping to locate data within Gravwell


When attempting to identify whether data from a new source is successfully being ingested into Gravwell, especially on existing tags, it can sometimes be difficult to locate the new source's data within all the noise. There are several search modules and methods that can be helpful in locating the expected data.


  • src  ->  This module can be used to show data originating from a specific source IP: src == <sourceIP> (if the origin device is behind a NAT device, use the NAT'd IP instead of the system's IP)
  • grep / words  ->  These modules can be used to search for specific information within the data, such as a hostname or IP within a syslog header.
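
For example, if a new host's syslog messages contain its hostname (webserver01 is a placeholder here), a quick check might be:

    tag=syslog words webserver01 | table TIMESTAMP SRC DATA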


If you are not 100% positive that the data is being ingested into the correct tag, it is possible to search multiple tags within a single search. This, in combination with the above filter modules and the table render module, can be useful when you need to identify both that the data exists in the system and which tag it is under.


Example:

tag=syslog,syslog-linux,syslog-switch,syslog-device*  src == 192.168.39.242 | table TAG DATA


NOTE: Gravwell supports wildcards within the tag field. While "tag=*" will work, it is not recommended, as it can be a very expensive brute-force search that will attempt to read all the data on your system within the specified timeframe. If a wildcard search is required, we highly recommend using as much filtering as possible and utilizing acceleration where possible to minimize the load on the system and decrease the total search time required for your results.
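
If you do need a wildcard, constrain it to a tag prefix and filter as early as possible, for example (the tag prefix and IP are placeholders):

    tag=syslog-* src == 192.168.39.242 | table TAG TIMESTAMP DATA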


 

