Things to Consider Before Committing to Self Service Logging ELK STACK COSTS


| Rapid7.com 2ELK Stack Costs |

TABLE OF CONTENTS

Introduction
What is the ELK Stack?
The ELKeBMWS Stack
  Beats
  Marvel
  Watcher
  Shield
  Cluster/Tribe Nodes
The High Cost Of Low Cost Solutions
  Hardware Costs
  Scaling Costs
  Cloud Hosting Costs
  Data Storage Costs
  Resource Costs
Conclusion


About the author, David Posin

David has been involved in the Information Technology Industry for two decades. Fifteen years of that time was spent consulting with many companies in a wide range of industries to build solid technology stacks and robust application architectures. David has watched the Cloud and the World Wide Web grow from their infancy, and now spends every day fully entrenched in those worlds. Currently, David builds high-performance web applications and offers professional technical writing services.

About InsightOps

InsightOps is a leading SaaS-based infrastructure monitoring tool used for real-time log centralization, search, and analysis. IT and DevOps professionals use InsightOps to easily ask questions of their operational data for immediate visibility into their IT environments. InsightOps makes it easy to get insights from your operational data without building, maintaining, or supporting your own log management stack.


INTRODUCTION

The ELK Stack is the current preferred stack for do-it-yourself (DIY) logging. It is generally thought to be composed of three software packages: Elasticsearch, Logstash, and Kibana. The truth is that a successful ELK Stack implementation requires a great deal more than those three technologies. Even with the best of community support, DIY logging with the ELK Stack will have surprises and unexpected costs. This paper points out some of the less well-understood requirements of a robust DIY ELK Stack.


WHAT IS THE ELK STACK?

The ELK Stack, also called the Elastic Stack, starts with a combination of three separate technologies configured to work together. Each piece in the ELK Stack handles one part of the general logging equation:

• Elasticsearch - Data storage and searching
• Logstash - Gathering and formatting
• Kibana - Reporting and analyzing

These three technologies are a good start but do not cover all of the services required for a robust, production-ready logging stack. Additional technologies are needed to maintain the health and security of the stack, as well as mechanisms to collect and disseminate information.

Adding Context

This white paper focuses on the production environment. It explores the costs and requirements for a reliable, robust, and scalable stack. Outside of production, Elasticsearch, Kibana, and Logstash can all run on the same machine. While that works for a development environment, running a production-grade stack on only one server is not advisable.


THE ELKeBMWS STACK

Production environments need to be reliable and fault tolerant. Elasticsearch, Logstash, and Kibana will need to be supported by other packages. They will need monitoring and redundancy like all software packages used in production.

Beats

Running Logstash decentralized, with installations on separate machines, may not be ideal. In an enterprise network, it might be preferable to have a central point to process and filter log data. It is possible to install Logstash on one server and have data shipped to it.

Centralizing log information in this way requires software called Beats on every machine being logged. Beats defines and controls the process of sending data from different log types to Logstash. Some example Beats are Packetbeat, Filebeat, and Winlogbeat. Each is designed to ship the specific type of log data it is built for.

Marvel

Like all services in a production environment, the Elastic Stack services need to be monitored. This is handled by a software package called Marvel. Marvel is designed to monitor and report the health of all of your Elastic Stack components. The importance of Marvel only increases as the stack grows. Clusters and Tribes (discussed below) can mean there are many independent components that all need to be monitored.

Watcher

One of the core responsibilities of any logging solution is to make people aware of critical events. The Elastic Stack has a tool called Watcher to provide this essential function. Watcher observes incoming log entries and sends notifications when certain events occur. Kibana can report on an event and Logstash can forward it, but notifying your support staff immediately, before problems grow, requires Watcher. Notifications can be sent via email and through other services based on the configuration.

Shield

Security is always going to be a consideration when installing a service. Shield was created to meet this need for the Elastic Stack and to centralize security amongst the different Stack components. It is recommended to use this product over non-Elastic security methods. Nginx is sometimes suggested to help limit access, but as the updated blog post “Restricting Users for Kibana with Filtered Aliases” shows, using technology outside the Elastic Stack can have unexpected consequences.


Another important security consideration is the use of HTTP communication by default. This should be changed when moving to production. Updating the stack for HTTPS can be done by using Shield. It will require some additional configuration and the appropriate certificates.

Cluster/Tribe Nodes

Finally, there are scaling issues to plan for. Elasticsearch is built using clusters to help handle and distribute Elasticsearch queries around the network. Clusters are composed of master and data nodes, and potentially, clients. Clusters can fill up with data over time, and it will be necessary to scale. As the Elasticsearch documentation states on the “Scale is Not Infinite” page,

“Most scaling problems can be solved by adding more nodes [servers].”

It’s important to prepare for adding nodes (servers) to your network as the Elasticsearch index grows.

Eventually, even clusters won’t be sufficient to store all the data Elasticsearch encompasses.

Every Elasticsearch node (master nodes, data nodes, and clients) stores information about its Elasticsearch cluster, called the cluster state, which is needed for proper routing. Eventually, the cluster state will be large enough to slow down performance.

When that occurs, it will be time to introduce Tribe Nodes to the network. Tribe nodes allow searching across Elasticsearch clusters.

Installing a robust and scalable production-ready Elastic Stack takes more than Elasticsearch, Logstash, and Kibana. A full accounting of the services required is:

• Elasticsearch
• Logstash
• Kibana (with an Elasticsearch client)
• Beats (per server and data-type being logged)
• Marvel
• Watcher
• Shield
• Clusters
• Tribe nodes (not initially, but eventually over time)

A well-assembled Elastic Stack will require all of these pieces before it can come close to the full functionality provided by a SaaS logging service (like InsightOps).

The ELK Stack is really the ELKeBMWSC(Tn) Stack.


THE HIGH COST OF FREE SOLUTIONS

Elasticsearch, Logstash, and Kibana are free, open source solutions. There is no cost to using the software in a self-hosted environment. Being open source is one of the biggest attractions of the Elastic Stack. Free is a compelling price point. Although the software is free, running it is not. There are several costs to be aware of.

Hardware Costs

The Elastic Stack is not an “install and go” solution. The number of servers required will depend on your needs. At a minimum, for any production environment, you will have to install software on three servers. Elasticsearch and Kibana will each have their own server, and Logstash must be installed on at least one host server.

There are performance reasons to consider having users connect to Kibana on a machine separate from Elasticsearch. Elasticsearch can require a lot of CPU and memory depending on the operations being run. If Kibana is sharing those resources, the result is added latency and slow performance for the user. Running Kibana on its own server is recommended by the Elastic documentation. “While Kibana isn’t terribly resource intensive, we still recommend running Kibana separate from your Elasticsearch data or master nodes.” *

Maintaining Elasticsearch performance is a careful balance between the number of servers and the amount of data. To realize the full capabilities of Elasticsearch, it is necessary to distribute pieces of the searchable data amongst its servers. Expect to add servers over time to keep the search performant.

Like all services, Elasticsearch will fail on occasion, so planning for failover is important. The recommendation is to have a one-to-one ratio between each primary server and a replicated backup. Each primary server should keep a complete copy of its data on at least one replica server. In the event of a hardware failure, Elasticsearch will automatically switch to the replica. The Elasticsearch documentation makes the same recommendation: “It provides high availability in case a shard/node [server] fails. For this reason, it is important to note that a replica shard is never allocated on the same node [server] as the original/primary shard that it was copied from.”

This is especially important for not losing data in Kibana. If data is unreachable, Kibana can’t indicate its absence. Reports and graphs will simply be incorrect until the problem is realized and fixed.
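The replica rule quoted above can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not Elasticsearch's actual allocation logic: the function names `place_shards` and `surviving_shards` and the node names are hypothetical, and each shard gets exactly one primary and one replica.

```python
from collections import defaultdict


def place_shards(num_shards, nodes):
    """Toy placement: one primary and one replica per shard.

    The replica always goes to the next node in the list, so it can
    never share a node with its own primary (requires >= 2 nodes).
    """
    if len(nodes) < 2:
        raise ValueError("need at least two nodes to separate replica from primary")
    placement = defaultdict(list)
    for shard in range(num_shards):
        primary_node = nodes[shard % len(nodes)]
        replica_node = nodes[(shard + 1) % len(nodes)]
        placement[primary_node].append((shard, "primary"))
        placement[replica_node].append((shard, "replica"))
    return dict(placement)


def surviving_shards(placement, failed_node):
    """Return the shard ids that still have a copy after one node fails."""
    return {
        shard
        for node, copies in placement.items()
        if node != failed_node
        for shard, _role in copies
    }


# Hypothetical two-server layout: one primary server and one replica server.
layout = place_shards(num_shards=4, nodes=["node-a", "node-b"])
still_searchable = surviving_shards(layout, failed_node="node-a")
```

Because every shard has a copy on a different server, losing either server in this sketch still leaves a copy of every shard available, which is the point of the one-to-one ratio.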

* https://www.elastic.co/guide/en/kibana/current/production.html

Servers may also be required to support the various tools mentioned above. Marvel and Watcher may require their own hardware for performance and logic reasons. Centralizing Logstash is also recommended, so there will be a need for at least one Logstash server to receive log data and to send it to Elasticsearch. Therefore, the absolute minimum number of servers is five:

• Elasticsearch primary server
• Elasticsearch replica server
• Kibana
• Logstash
• Marvel, Watcher, etc.

Scaling Costs

Installing the Elastic Stack is only the beginning. It will grow over time and will need monitoring and scaling to keep it healthy. New indexes will require new servers. Growing logs will require more disk space. Changes in your data or logging structure will require reindexing your data.

Unfortunately, there will be problems that can’t be solved by throwing more disk space or servers at Elasticsearch. Scrunch.com’s blog post, “Lessons Learned From A Year Of Elasticsearch In Production”, mentions several potential performance-affecting issues to monitor. They suggest monitoring thread pools and heap memory, both of which can cause significant performance issues if their sizes are not monitored. Marvel can help with this, as well as a regular schedule of pruning and archiving.

Costs in Lost Opportunity

Setting up indexes is as much an art as it is a science. There will be a definite learning curve that will affect the quality of the data gathered. Incorrectly configured indexes can mean important data is lost until the issue is rectified. It is important to be vigilant about what is being logged and to compare it to what should be logged.

Cloud Hosting Costs

Self-hosting will help limit cost but may not be practical or desirable. In that case, cloud hosting is the most likely option. Cloud computing will incur costs for:

• Hardware
• Data stored
• Data transferred between servers

Data transfer costs in particular can vary wildly. A major event or issue could cause a burst of activity that results in much higher than usual costs. Major bursts of activity could result in overage fees at best, and data loss at worst.
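To make the effect of a burst concrete, the three cost categories above can be combined in a rough Python sketch. Everything here is an assumption for illustration: the function name `monthly_cloud_cost` is hypothetical, and every rate and volume in the usage lines is a made-up placeholder, not a quote from any provider. Substitute your provider's actual pricing.

```python
def monthly_cloud_cost(servers, hourly_rate, stored_gb, storage_rate_per_gb,
                       transferred_gb, transfer_rate_per_gb, hours_per_month=730):
    """Rough monthly cost split into the three categories listed above.

    All rates are caller-supplied; nothing here reflects real pricing.
    """
    compute = servers * hourly_rate * hours_per_month
    storage = stored_gb * storage_rate_per_gb
    transfer = transferred_gb * transfer_rate_per_gb
    return {
        "compute": compute,
        "storage": storage,
        "transfer": transfer,
        "total": compute + storage + transfer,
    }


# Placeholder numbers only: a normal month versus a month in which an
# incident multiplies the data transferred between servers by ten.
normal_month = monthly_cloud_cost(5, 0.10, 500, 0.10, 200, 0.02)
burst_month = monthly_cloud_cost(5, 0.10, 500, 0.10, 2000, 0.02)
extra_cost = burst_month["total"] - normal_month["total"]
```

In this sketch the compute and storage lines are identical in both months, so the entire difference comes from the transfer line, which is the category the paragraph above singles out as the least predictable.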

Data Storage Costs

The amount of space needed for data storage requires careful consideration. Elasticsearch works by storing independent indexes of data. Data can be indexed more than once depending on how the indexes are configured. Additional fields added to documents for indexing purposes can also add to data size. Furthermore, storage needs will increase over time as both the data being indexed and the indexes themselves grow. It is best to prepare for storage requirements to increase.
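A back-of-the-envelope estimate can help with that preparation. The Python sketch below is an assumption-laden illustration, not a sizing formula from the Elasticsearch documentation: the function name `estimate_storage_gb` is hypothetical, and the default `index_overhead` of 1.2 is a placeholder that should be replaced with a ratio measured on your own data.

```python
def estimate_storage_gb(daily_raw_gb, retention_days, replicas=1, index_overhead=1.2):
    """Estimate disk space needed to hold indexed log data.

    daily_raw_gb   -- raw log volume ingested per day, in GB
    retention_days -- how many days of data stay searchable
    replicas       -- replica copies kept per primary (1 = one-to-one ratio)
    index_overhead -- indexed size divided by raw size (placeholder value)
    """
    if daily_raw_gb < 0 or retention_days < 0 or replicas < 0 or index_overhead <= 0:
        raise ValueError("inputs must be non-negative and index_overhead positive")
    indexed_per_day = daily_raw_gb * index_overhead
    copies = 1 + replicas  # the primary plus each replica holds a full copy
    return indexed_per_day * copies * retention_days


# Placeholder inputs: 10 GB of raw logs per day kept for 30 days.
with_replica = estimate_storage_gb(10, 30, replicas=1)
without_replica = estimate_storage_gb(10, 30, replicas=0)
```

The sketch shows why replicas and retention dominate the bill: each replica adds a full copy of the indexed data, and the total scales linearly with the number of days kept.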


“Data storage is probably the biggest cost you will experience over time.”

Documentation Costs

Meticulous documentation is essential for every Elastic Stack implementation. The institutional knowledge gained from building an Elastic Stack can’t be recreated.

This information will be extremely valuable for long-term maintenance and support. As your Stack matures and ages over time, it is important to keep the documentation current.


CONCLUSION

The ELK Stack is most useful when having full control over the environment is important and the needed resources are available. As illustrated here, the Elastic Stack is not a shortcut to avoid the costs of proper logging. The Elastic Stack may not have a monthly fee and may not have software licenses, but that cost is still there in the form of rigor, resources, and scaling. The decision of whether or not to use the Elastic Stack for DIY logging is not about how much it costs compared to managed services, but rather where you want to allocate your resources and funds.


ABOUT RAPID7

With Rapid7 (NASDAQ: RPD), security and IT professionals gain the clarity and confidence they need to protect against risk and drive innovation. Rapid7 analytics transform data into answers, eliminating blind spots and giving customers the insight they need to securely develop and operate today’s sophisticated IT infrastructures, networks, and applications. Rapid7 solutions include vulnerability management, penetration testing, application security, incident detection and response, SIEM and log management, and offers managed and consulting services across its portfolio. Rapid7 is trusted by more than 6,300 organizations across over 120 countries, including 39% of the Fortune 1000. To learn more about Rapid7 or get involved in our threat research, visit www.rapid7.com.