ELK is charging in!

Lister Digital

Aug 6, 2020

What and Why?

ELK is a powerful log management system: a collection of multiple open source products such as Elasticsearch, Logstash, Kibana, Curator, and ElastAlert. It is designed to let users take data from any source, in any format, and search, analyze, and visualize that data in real time. It provides centralized logging, which is useful when attempting to identify problems with servers or applications, and it allows us to search all our logs in a single consolidated portal.

Modern log management and analysis solutions include the following key capabilities:

  • Aggregation – the ability to collect and ship logs from multiple data sources.
  • Processing – the ability to transform log messages into meaningful data for easier analysis.
  • Storage – the ability to store data for extended time periods to allow for monitoring, trend analysis, and security use cases.
  • Analysis – the ability to dissect the data by querying it and creating visualizations and dashboards on top of it.

ELK modules

  • Elasticsearch - An open source distributed, RESTful, JSON-based search and analytics engine (see the indexing sketch after this list).
  • Curator - Manages ES indices and snapshots (log backups). It is mainly used to back up data from ES.
  • Nginx - Open source ELK doesn't provide an authentication/authorisation system, so we add Nginx as a reverse proxy in front of Kibana to provide basic HTTP authentication.
  • Kibana - An open source data visualization and exploration tool used for log and time-series analytics and application monitoring.
  • ElastAlert - A simple framework for alerting on anomalies, spikes, or other patterns in data from ES.
  • Filebeat - A lightweight shipper for forwarding and centralizing log data.
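
Because Elasticsearch exposes everything over a REST/JSON API, it is easy to try from code. Below is a minimal sketch using the official Python client (elasticsearch 7.x); the index name, the document fields, and the localhost:9200 address are illustrative assumptions, not part of this setup.

    from elasticsearch import Elasticsearch

    # Assumes an ES node reachable on localhost:9200.
    es = Elasticsearch(["http://localhost:9200"])

    # Index a log event as a JSON document (index name is illustrative).
    es.index(index="app-logs", body={
        "@timestamp": "2020-08-06T10:15:00Z",
        "level": "ERROR",
        "message": "Connection refused to upstream service",
    })

    # Search it back with a full-text match query.
    result = es.search(index="app-logs", body={
        "query": {"match": {"message": "connection refused"}}
    })
    print(result["hits"]["total"]["value"])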

Elasticsearch Cluster and Nodes

An Elasticsearch cluster is a group of one or more Elasticsearch node instances that are connected together. The power of an Elasticsearch cluster lies in the distribution of tasks, such as searching and indexing, across all the nodes in the cluster.

Nodes

The nodes can be assigned different jobs or responsibilities (see the inspection sketch after this list):

  • Data nodes — store data and execute data-related operations such as search and aggregation
  • Master nodes — in charge of cluster-wide management and configuration actions such as adding and removing nodes
  • Client nodes — forward cluster requests to the master node and data-related requests to data nodes
  • Ingest nodes — pre-process documents before indexing
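
To see which role each node actually plays, the _cat/nodes API can be queried; a minimal sketch with the same Python client as above:

    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])

    # "node.role" is a string such as "dim" (data, ingest, master-eligible);
    # the "master" column marks the elected master node with "*".
    for node in es.cat.nodes(format="json", h="name,node.role,master"):
        print(node["name"], node["node.role"], node["master"])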

Cluster

A cluster consists of one or more nodes which share the same cluster name. Each cluster has a single master node which is chosen automatically by the cluster and which can be replaced if the current master node fails.

The main concepts used in ES clusters are Documents, Indices, and Shards:

  • Documents
    A document is a JSON document stored in Elasticsearch. It is analogous to a row in a table in a relational database.
  • Index
    An index is like a table in a relational database. It is a logical namespace that maps to one or more primary shards and can have zero or more replica shards.
  • Primary shards
    Each document is stored in a single primary shard. When you index a document, it is indexed first on the primary shard, then on all replicas of that primary.
  • Replica shards
    A replica is a copy of a primary shard, and it serves two purposes:
    • Increase failover:
      a replica shard can be promoted to a primary shard if the primary fails
    • Increase performance:
      get and search requests can be handled by primary or replica shards. By default, each primary shard has one replica, but the number of replicas can be changed dynamically on an existing index (see the sketch after this list). A replica shard is never allocated on the same node as its primary shard.
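
As a concrete illustration, here is how shard and replica counts could be set when creating an index, and how the replica count can be changed later. A sketch with the Python client; the index name and counts are illustrative:

    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])

    # number_of_shards is fixed at creation time; number_of_replicas is not.
    es.indices.create(index="app-logs", body={
        "settings": {"number_of_shards": 3, "number_of_replicas": 1}
    })

    # The replica count can be changed dynamically on the live index.
    es.indices.put_settings(index="app-logs", body={
        "index": {"number_of_replicas": 2}
    })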

Dockerized ELK architecture

Below is the architecture for a configuration using Docker on a single server, with dedicated data disks for each ES node.

This architecture is a three-node cluster: it consists of three Elasticsearch nodes plus Curator, Nginx, and Kibana. Each source machine is configured with a Filebeat agent, which ships its logs to the ES nodes. A quick health check for this setup is sketched below.
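
Querying the cluster health should report three nodes and, once all replicas are allocated, a green status. A sketch, assuming the cluster is reachable on localhost:9200:

    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])

    health = es.cluster.health()
    # For this architecture we expect "number_of_nodes": 3, and status
    # "green" once every primary and replica shard is allocated.
    print(health["number_of_nodes"], health["status"])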


Kibana - Visualizer

Kibana is an open source, browser-based visualization tool mainly used to analyze large volumes of logs in the form of line graphs, bar graphs, pie charts, heat maps, region maps, coordinate maps, gauges, goals, Timelion, etc. Visualization makes it easy to spot or predict changes in the trend of errors or other significant events in the input source.

Advantages of Kibana

Kibana offers the following advantages to its users:

  • Contains open source, browser-based visualization tools mainly used to analyze large volumes of logs in the form of line graphs, bar graphs, pie charts, heat maps, etc.
  • Simple and easy for beginners to understand.
  • Easy conversion of visualizations and dashboards into reports.
  • Canvas visualization helps analyze complex data in an easy way.
  • Timelion visualization helps compare data backwards in time to better understand performance.

Presenting the Kibana home page.

Defining an index pattern

The next step is to define a new index pattern, or in other words, to tell Kibana which Elasticsearch index to analyze. Once a new index is created in ES, its pattern can be defined in Kibana.

In Kibana, go to Management → Kibana Index Patterns, and Kibana will automatically identify the new index pattern. Ex: “logstash-*”, “filebeat-*”, “metricbeat-*”, etc.

Define it as “logstash-*”, “filebeat-*”, or “metricbeat-*”, and in the next step select @timestamp as your Time Filter field.

Hit Create index pattern, and you are ready to analyze the data. This is how an index pattern page looks:
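
For reference, the same step can be scripted against Kibana's saved objects API (available in Kibana 7.x). A sketch using the requests library; the Kibana URL and the Nginx basic-auth credentials here are assumptions:

    import requests

    resp = requests.post(
        "http://localhost:5601/api/saved_objects/index-pattern",
        headers={"kbn-xsrf": "true"},       # header required by Kibana's API
        auth=("admin", "changeme"),         # hypothetical Nginx credentials
        json={"attributes": {
            "title": "filebeat-*",          # the index pattern
            "timeFieldName": "@timestamp",  # the Time Filter field
        }},
    )
    resp.raise_for_status()
    print(resp.json()["id"])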

Go to the Discover tab in Kibana to take a look at the data.

Kibana Queries

Kibana querying is an art unto itself, and there are various methods for performing searches on your data. To search the indices that match the current index pattern, enter your search criteria in the query bar. By default, Kibana uses its standard query language (KQL), which features autocomplete and a simple, easy-to-use syntax. If you prefer Kibana’s legacy query language, based on the Lucene query syntax, you can switch to it from the KQL popup in the query bar. When the legacy query language is enabled, you can use the full JSON-based Elasticsearch Query DSL.

  • String queries: A query may consist of one or more words or a phrase. A phrase is a group of words surrounded by double quotation marks, such as "test search".
  • Field-based queries: Kibana allows you to search specific fields.
  • Regexp queries: Kibana supports regular expressions for filters and expressions.
  • Range queries: Range queries allow a field to have values between the lower and upper bounds. The interval can include or exclude the bounds depending on the type of brackets that you use. Ex: to search for slow transactions with a response time greater than 10ms (assuming responsetime is recorded in nanoseconds), the query would look like: responsetime: {10000000 TO *}
  • Boolean queries: Boolean operators (AND, OR, NOT) allow combining multiple sub-queries through logic operators. Note: operators such as AND, OR, and NOT must be capitalized. A sketch of how such queries map to the Elasticsearch query DSL follows this list.
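
Roughly speaking, a Lucene-syntax search typed into the query bar corresponds to a query_string query on the Elasticsearch side. A sketch with the Python client; the field names and values are illustrative:

    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])

    # Combines a field-based query, a boolean operator, and a range query.
    result = es.search(index="filebeat-*", body={
        "query": {"query_string": {
            "query": "response:200 AND responsetime:{10000000 TO *}"
        }}
    })
    print(result["hits"]["total"]["value"])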

Kibana Filters

In Kibana, we can also filter transactions by clicking on elements within a visualization. For example, to filter for all the HTTP redirects that are coming from a specific IP and port, click the Filter for value icon next to the client.ip and client.port fields in the transaction detail table. To exclude the HTTP redirects coming from the IP and port, click the Filter out value icon instead.

The selected filters appear under the search box.
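
Under the hood, such UI filters become clauses in the filter context of a bool query. A sketch of the equivalent Elasticsearch request; the IP and port values are illustrative:

    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])

    # The two "Filter for value" clicks translate to exact-match term filters.
    result = es.search(index="filebeat-*", body={
        "query": {"bool": {"filter": [
            {"term": {"client.ip": "10.0.0.5"}},
            {"term": {"client.port": 443}},
        ]}}
    })
    print(result["hits"]["total"]["value"])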

Saving the search

We have an option to save queried searches in the portal.
After clicking the Save option, we can give the new saved search a name. Once it is saved, we can open that search from the Open menu whenever required.

Create Dashboard

To create a dashboard, we must have data indexed into Elasticsearch, an index pattern to retrieve the data from Elasticsearch, and visualizations, saved searches, or maps. If these don’t exist, you will be prompted to add them as you create the dashboard, or you can add sample data sets, which include pre-built dashboards.

Click on Visualize and you will see a list of options and chart types for creating different visualizations.

Ex: Let’s select Vertical Bar and choose the saved search that you want to visualize. You will then be routed to the graphical page, where you can set the metrics on the defined pattern. Here’s a sample vertical bar chart.

Now we can create a dashboard from the visualization.
Click Dashboard → Create New Dashboard → Add, then select the visualization from Add Panels to place it on the new dashboard.
