Simplified Logging and Analysis with Parseable

Parseable is a free and open source log storage and analysis platform. It is written in the Rust programming language. This article is a quick introduction to the topic.

Parseable builds on recent advances in data compression, storage and networking to offer a simple but efficient platform.

But before we get into Parseable, let’s understand why you should even care about logging.

Why logging?

For us developers, it is not always clear why so much fuss is made about logging. We debug the code, find the issue, fix it and move on. Why does a log have to be stored and persisted for the long term?

Log is a loaded term used for a large variety of data. But at a high level, any stream of events that represents the current state of a software or hardware system can be called a log. You can already see the horizon widening now. Consider these scenarios:

  • SaaS companies are answerable for any downtime, data access, security breaches, data loss, and so on. In such cases, it makes sense to retain every bit of log data, so that you can not only avoid issues, but also fix them properly if something happens.
  • Consumer apps thrive on user behaviour, access patterns and other data. All these are logged and passed through various ML pipelines to generate better responses, notifications, and nudges.
  • IoT devices are growing in number, and so is the data they generate. Crunching this data for predictive maintenance and analysis means it has to be stored for the long term first.

This is just a small list of possible scenarios where log data is critical for business impact. Now that the need for logging and retention of data for long durations is established, let’s take a deeper look at Parseable.

What is Parseable?

Parseable is a free and open source log storage and analysis platform, released under the AGPL-3.0 licence. The Parseable repository is available on GitHub at https://github.com/parseablehq/parseable.

The idea behind Parseable is to rethink logging; it’s a modern, blazing fast, cloud native (stateless, easily deployable across cloud providers) platform that is ridiculously easy to deploy and use. Some core concepts on which Parseable is built are listed below.

No need for indexing: Traditionally, text search engines like Elastic have doubled up as log storage platforms. This made sense because log data had to be searchable to be really useful. But it came at a high cost: indexing is CPU-intensive and slows down ingestion. Additionally, most of these indexing systems generate index data of roughly the same size as the raw log data. This doubles the storage cost and increases complexity. Parseable changes this. With a columnar data format (Parquet), it is possible to compress and query the log data efficiently without indexing it.
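To make the columnar idea concrete, here is a toy sketch in plain Python. It is not how Parquet or Parseable is implemented; the function names and sample events are invented for illustration. It only shows why storing each field as its own column helps: repeated values sit next to each other and compress well, and a filter query needs to scan just one column rather than every full record.

```python
def to_columns(rows):
    """Pivot row-oriented log events into one list of values per field."""
    fields = []
    for row in rows:
        for key in row:
            if key not in fields:
                fields.append(key)
    return {field: [row.get(field) for row in rows] for field in fields}


def run_length_encode(values):
    """Collapse consecutive repeated values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


def count_matching(columns, field, wanted):
    """Answer a simple filter query by scanning a single column."""
    return sum(1 for value in columns[field] if value == wanted)


# Hypothetical sample events, used only for this illustration.
rows = [
    {"status": "ok", "service": "api", "latency_ms": 12},
    {"status": "ok", "service": "api", "latency_ms": 15},
    {"status": "ok", "service": "web", "latency_ms": 9},
    {"status": "error", "service": "web", "latency_ms": 120},
]

columns = to_columns(rows)
status_rle = run_length_encode(columns["status"])
error_count = count_matching(columns, "status", "error")
```

Here the four `status` values shrink to two run-length pairs, and counting the errors never touches the `service` or `latency_ms` columns.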

Ownership of both data and content: With Parquet as the storage format, stored on standard object storage buckets, users not only own their log data but also have unabridged access to the actual content. This means they can simply point analysis tools like Spark, Presto or TensorFlow at the data to extract more value from it. This is an extremely powerful feature, opening up new avenues of data analysis.

Fluid schema: Logs are generally semi-structured by nature, and they are ever evolving. For example, a developer may start with a log schema like this:

```
{
  "Status": "Ready",
  "Application": "Facebook"
}
```

But as more information gets collected, the log schema may evolve to:

```
{
  "Status": "Ready",
  "Application": {
    "UserID": "asdadaferssda",
    "Name": "Facebook"
  }
}
```

Engineering and SRE teams regularly face schema-related issues. Parseable solves this with a fluid schema approach that lets users change the schema on the fly.
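One way to picture a schema that grows with the data is sketched below. This is an assumption-laden illustration, not Parseable's actual algorithm: the helper names `flatten` and `merge_schema` are hypothetical, and the dotted naming of nested fields is just one possible convention. Each event is flattened into field names, and fields never seen before are simply added to the schema instead of being rejected.

```python
def flatten(event, prefix=""):
    """Map each leaf field to its type name, using dotted names for nesting."""
    flat = {}
    for key, value in event.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = type(value).__name__
    return flat


def merge_schema(schema, event):
    """Return a copy of the schema extended with any new fields in the event."""
    merged = dict(schema)
    for name, type_name in flatten(event).items():
        merged.setdefault(name, type_name)
    return merged


# The two log shapes from the example above.
first = {"Status": "Ready", "Application": "Facebook"}
second = {
    "Status": "Ready",
    "Application": {"UserID": "asdadaferssda", "Name": "Facebook"},
}

schema = merge_schema({}, first)
schema = merge_schema(schema, second)
```

After both events, the schema holds the original `Application` field alongside the newer `Application.UserID` and `Application.Name` fields, so older and newer events can coexist in the same stream.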

SDK-less ingestion: Getting logs into logging platforms is often quite convoluted, with several protocols and connectors floating around. Parseable aims to make log ingestion as easy as possible. This means you can simply use HTTP POST calls to send logs to Parseable; no complicated SDKs are required. Alternatively, if you want to use a logging agent like Fluent Bit, Vector or Logstash, almost all the major log collectors already support HTTP output; hence, Parseable is already compatible with your favourite log collection agent.
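As a rough sketch of what SDK-less ingestion can look like, the snippet below builds a plain HTTP POST with Python's standard library only. The endpoint path `/api/v1/logstream/<stream>`, the stream name `demo` and the use of HTTP basic authentication are assumptions made for illustration; check the Parseable documentation for the exact API of your version. The host, port and credentials match the Docker setup shown later in this article.

```python
import base64
import json
import urllib.request


def build_ingest_request(base_url, stream, events, username, password):
    """Build (but do not send) an HTTP POST carrying a JSON array of events."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        # Assumed endpoint path; verify it against the Parseable docs.
        url=f"{base_url}/api/v1/logstream/{stream}",
        data=json.dumps(events).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


request = build_ingest_request(
    "http://localhost:8000",
    "demo",  # hypothetical stream name
    [{"Status": "Ready", "Application": "Facebook"}],
    "parseable",
    "parseable",
)
```

Calling `send(request)` against a running instance would perform the actual upload; the request is only constructed here so the sketch runs without a server.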

Getting started

Let’s see how to get started with Parseable now, using its Docker image.

To help you experience it quickly, we’ll point Parseable at a publicly accessible object storage bucket. Here are the commands:

```
cat << EOF > parseable-env
P_S3_URL=https://minio.parseable.io:9000
P_S3_ACCESS_KEY=minioadmin
P_S3_SECRET_KEY=minioadmin
P_S3_REGION=us-east-1
P_S3_BUCKET=parseable
P_LOCAL_STORAGE=/data
P_USERNAME=parseable
P_PASSWORD=parseable
EOF

mkdir -p /tmp/data
docker run \
  -p 8000:8000 \
  --env-file parseable-env \
  -v /tmp/data:/data \
  parseable/parseable:latest
```

You can now log in to the Parseable UI with the credentials we passed here, i.e., the username ‘parseable’ and the password ‘parseable’. The UI will already show some data, because Parseable is pointing to the publicly open bucket.

Note that this is a public object storage bucket, which we are using for demo purposes only. Make sure to change the bucket and credentials to your object storage instance before sending any data to Parseable.
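A small pre-flight check can help catch leftover demo settings before real data is sent. The sketch below is only an illustration: the helper names are hypothetical, and it does nothing more than compare an env file against the demo values used in this article. It deliberately does not flag `P_S3_BUCKET`, since your own bucket could legitimately share the demo bucket's name.

```python
# Demo values taken from the env file shown in this article.
DEMO_VALUES = {
    "P_S3_URL": "https://minio.parseable.io:9000",
    "P_S3_ACCESS_KEY": "minioadmin",
    "P_S3_SECRET_KEY": "minioadmin",
    "P_USERNAME": "parseable",
    "P_PASSWORD": "parseable",
}


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def demo_settings_in_use(settings):
    """Return the sorted names of settings that still hold a demo value."""
    return sorted(
        key for key, value in DEMO_VALUES.items() if settings.get(key) == value
    )


demo_env = """
P_S3_URL=https://minio.parseable.io:9000
P_S3_ACCESS_KEY=minioadmin
P_S3_SECRET_KEY=minioadmin
P_S3_REGION=us-east-1
P_S3_BUCKET=parseable
P_LOCAL_STORAGE=/data
P_USERNAME=parseable
P_PASSWORD=parseable
"""

leftovers = demo_settings_in_use(parse_env(demo_env))
```

For the demo file, `leftovers` lists every setting that should be replaced; an empty list means none of the demo values remain.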

To get a deeper understanding of how Parseable works and how to ingest logs, please refer to the documentation available at https://www.parseable.io/docs/introduction.

I hope this article has given you better insights into the logging ecosystem, and that you find Parseable an interesting project to try out. Hopefully, you will join the community too.
