What is Logstash commonly used for, and what does its architecture look like?

2022-12-01 13:34:36
Source: 时代新闻网

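1: Logstash overview

Put simply, Logstash is a pipeline capable of real-time data transport: it carries data from the input end of the pipeline to the output end. You can also add filters in the middle of the pipeline to suit your needs, and Logstash provides many powerful filters to cover a wide range of use cases.

Logstash is often used as the log collection component of a logging system, most commonly as the log collector in the ELK stack.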

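2: What Logstash does

Logstash centralizes, transforms, and stashes your data. It is an open-source, server-side data processing pipeline that ingests data from multiple sources simultaneously, transforms it, and then sends it to your favorite "stash" (storage destination).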

3: Logstash architecture

The basic processing flow of Logstash is input | filter | output. If the data needs no additional processing, the filter stage can be omitted.

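3.1 Input: ingests data of all shapes, sizes, and sources, collecting it from individual servers.

Data is often scattered or siloed across many systems in many formats. Logstash supports a variety of inputs that capture events from many common sources at the same time, so you can ingest from your logs, metrics, web applications, data stores, and various AWS services in a continuous, streaming fashion.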
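As a rough illustration of capturing events from several sources at once, the sketch below declares two inputs in one pipeline. The log path, the type label, and port 5044 are assumptions chosen for the example, not values from this article's setup.

```
input {
  # tail a local log file (path is an assumed example)
  file {
    path => ["/var/log/messages"]
    type => "syslog"
  }
  # accept events shipped by Beats agents such as Filebeat (port is an assumed example)
  beats {
    port => 5044
  }
}
```

Events from both inputs flow into the same filter and output stages.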

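3.2 Filter

Filters apply processing to an event before it is sent out through an output. The most widely used filter is grok.

grok: parses unstructured text data. It is currently the go-to way in Logstash to turn unstructured log data into structured, queryable data.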

[root@node1 ~]# rpm -ql logstash | grep "patterns$"    # where the predefined grok pattern files are located

/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-patterns-core-4.1.2/patterns/grok-patterns

/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-patterns-core-4.1.2/patterns/mcollective-patterns

[root@node1 ~]#

3.3 Output: writes the filtered data to the databases and other storage systems of your choice.
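A single event can be sent to more than one destination. The following is a minimal sketch, assuming an Elasticsearch node at localhost:9200 and a writable /tmp directory; both are placeholders for illustration.

```
output {
  # index events into Elasticsearch, one index per day (host is an assumed example)
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
  # also keep a copy in a local file (path is an assumed example)
  file {
    path => "/tmp/logstash-output.log"
  }
}
```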

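3.4 Summary:

inputs: required; they generate events (Inputs generate events). Common plugins: file, syslog, redis, beats (for example, Filebeat)

filters: optional; they process and transform data (filters modify them). Common plugins: grok, mutate, drop, clone, geoip

outputs: required; they send data elsewhere (outputs ship them elsewhere). Common plugins: elasticsearch, file, graphite, statsd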
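The filter plugins listed above can be chained, and they run in the order written. A hedged sketch, assuming events that already carry request, bytes, ident, and clientip fields (for instance, produced by an earlier grok filter); the field names and the health-check path are hypothetical:

```
filter {
  # discard events we are not interested in (condition is an assumed example)
  if [request] == "/healthcheck" {
    drop { }
  }
  # convert a string field to a number and remove a field we do not need
  mutate {
    convert => { "bytes" => "integer" }
    remove_field => ["ident"]
  }
  # enrich the event with location data looked up from the client IP
  geoip {
    source => "clientip"
  }
}
```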

4: Installing Logstash from the RPM package

Four nodes were prepared for the experiments:

IP               Node
192.168.126.128  node1
192.168.126.129  node2
192.168.126.130  node3
192.168.126.131  node4

[root@node1 ~]# rpm -ivh logstash-7.9.1.rpm

warning: logstash-7.9.1.rpm: Header V4 RSA/SHA512 Signature, key ID d88e42b4: NOKEY

Preparing... ################################# [100%]

Updating / installing...

1:logstash-1:7.9.1-1 ################################# [100%]

Using provided startup.options file: /etc/logstash/startup.options

/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/pleaserun-0.0.31/lib/pleaserun/platform/base.rb:112: warning: constant ::Fixnum is deprecated

Successfully created system startup script for Logstash

[root@node1 ~]# vim /etc/profile.d/logstash.sh    # add the Logstash executables to PATH

export PATH=$PATH:/usr/share/logstash/bin

[root@node1 ~]# source /etc/profile.d/logstash.sh

[root@node1 ~]# java -version    # Logstash is written in JRuby, so it needs a Java runtime

openjdk version "1.8.0_262"

OpenJDK Runtime Environment (build 1.8.0_262-b10)

OpenJDK 64-Bit Server VM (build 25.262-b10, mixed mode)

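5: The Logstash workflow

input  { ... }    # where to read the input data from

filter { ... }    # parse the data into structured fields, for example with grok patterns

output { ... }    # where to send and store the processed data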

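Example 1: read from standard input and write to standard output.

[root@node1 ~]# cd /etc/logstash/conf.d/    # Logstash configuration files live in this directory by default

[root@node1 conf.d]# ls

[root@node1 conf.d]# vim shil.conf

input {
  stdin { }                 # standard input
}

output {
  stdout {                  # standard output
    codec => rubydebug      # print events in the rubydebug format
  }
}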

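# --config.debug only logs the compiled configuration when log.level is set to debug (see the warning in the output); to check a configuration file for errors, use --config.test_and_exit.

[root@node1 conf.d]# logstash -f /etc/logstash/conf.d/shil.conf --config.debug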

WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults

Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console

[WARN ] 2020-10-13 13:11:00.251 [main] runner - --config.debug was specified, but log.level was not set to 'debug'! No config info will be logged.

[INFO ] 2020-10-13 13:11:00.261 [main] runner - Starting Logstash {"logstash.version"=>"7.9.1", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 25.262-b10 on 1.8.0_262-b10 +indy +jit [linux-x86_64]"}

[INFO ] 2020-10-13 13:11:00.319 [main] writabledirectory - Creating directory {:setting=>"path.queue", :path=>"/usr/share/logstash/data/queue"}

[INFO ] 2020-10-13 13:11:00.340 [main] writabledirectory - Creating directory {:setting=>"path.dead_letter_queue", :path=>"/usr/share/logstash/data/dead_letter_queue"}

[WARN ] 2020-10-13 13:11:00.803 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified

[INFO ] 2020-10-13 13:11:00.845 [LogStash::Runner] agent - No persistent UUID file found. Generating new UUID {:uuid=>"593d27c7-7f01-4bbc-a68c-e60c555d2f73", :path=>"/usr/share/logstash/data/uuid"}

[INFO ] 2020-10-13 13:11:02.670 [Converge PipelineAction::Create] Reflections - Reflections took 44 ms to scan 1 urls, producing 22 keys and 45 values

[INFO ] 2020-10-13 13:11:03.627 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>250, "pipeline.sources"=>["/etc/logstash/conf.d/shil.conf"], :thread=>"#"}

[INFO ] 2020-10-13 13:11:04.503 [[main]-pipeline-manager] javapipeline - Pipeline Java execution initialization time {"seconds"=>0.87}

[INFO ] 2020-10-13 13:11:04.567 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}

The stdin plugin is now waiting for input:

[INFO ] 2020-10-13 13:11:04.682 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}

[INFO ] 2020-10-13 13:11:04.935 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600}

^C[WARN ] 2020-10-13 13:16:23.243 [SIGINT handler] runner - SIGINT received. Shutting down.

[INFO ] 2020-10-13 13:16:24.389 [Converge PipelineAction::Stop] javapipeline - Pipeline terminated {"pipeline.id"=>"main"}

^C[FATAL] 2020-10-13 13:16:24.429 [SIGINT handler] runner - SIGINT received. Terminating immediately..

[ERROR] 2020-10-13 13:16:24.490 [LogStash::Runner] Logstash - org.jruby.exceptions.ThreadKill

[root@node1 conf.d]# logstash -f /etc/logstash/conf.d/shil.conf

WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults

Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console

[INFO ] 2020-10-13 13:16:44.536 [main] runner - Starting Logstash {"logstash.version"=>"7.9.1", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 25.262-b10 on 1.8.0_262-b10 +indy +jit [linux-x86_64]"}

[WARN ] 2020-10-13 13:16:44.920 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified

[INFO ] 2020-10-13 13:16:46.360 [Converge PipelineAction::Create] Reflections - Reflections took 37 ms to scan 1 urls, producing 22 keys and 45 values

[INFO ] 2020-10-13 13:16:47.108 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>250, "pipeline.sources"=>["/etc/logstash/conf.d/shil.conf"], :thread=>"#"}

[INFO ] 2020-10-13 13:16:47.839 [[main]-pipeline-manager] javapipeline - Pipeline Java execution initialization time {"seconds"=>0.73}

[INFO ] 2020-10-13 13:16:47.900 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}

The stdin plugin is now waiting for input:

[INFO ] 2020-10-13 13:16:48.010 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}

[INFO ] 2020-10-13 13:16:48.227 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600}

hello world    # text typed on standard input

{
          "host" => "node1",          # current host
       "message" => "hello world",    # the message that was typed
      "@version" => "1",              # event version
    "@timestamp" => 2020-10-13T06:08:07.476Z
}

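Example 2: use grok to parse a log line read from standard input and print the result to standard output.

2.1 We write our own grok expression to parse the log line.

Syntax:

%{SYNTAX:SEMANTIC}

SYNTAX: the name of a predefined pattern;

SEMANTIC: the identifier you assign to the matched text;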
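grok also accepts an optional third part, %{SYNTAX:SEMANTIC:TYPE}, where TYPE is int or float; without it every captured field is stored as a string. A small sketch, using hypothetical field names:

```
filter {
  grok {
    # bytes is stored as an integer and duration as a float instead of strings
    match => { "message" => "%{NUMBER:bytes:int} %{NUMBER:duration:float}" }
  }
}
```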

[root@node1 conf.d]# vim groksimple.conf

input {
  stdin { }
}

filter {
  grok {
    # split the line into client IP, method, request path, bytes, and duration
    match => { "message" => "%{IP:clientip} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
  }
}

output {
  stdout {
    codec => rubydebug
  }
}

[root@node1 conf.d]# logstash -f /etc/logstash/conf.d/groksimple.conf

WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults

Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console

[INFO ] 2020-10-13 14:29:41.936 [main] runner - Starting Logstash {"logstash.version"=>"7.9.1", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 25.262-b10 on 1.8.0_262-b10 +indy +jit [linux-x86_64]"}

[WARN ] 2020-10-13 14:29:42.412 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified

[INFO ] 2020-10-13 14:29:44.025 [Converge PipelineAction::Create] Reflections - Reflections took 42 ms to scan 1 urls, producing 22 keys and 45 values

[INFO ] 2020-10-13 14:29:44.995 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>250, "pipeline.sources"=>["/etc/logstash/conf.d/groksimple.conf"], :thread=>"#"}

[INFO ] 2020-10-13 14:29:45.749 [[main]-pipeline-manager] javapipeline - Pipeline Java execution initialization time {"seconds"=>0.74}

[INFO ] 2020-10-13 14:29:45.820 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}

The stdin plugin is now waiting for input:

[INFO ] 2020-10-13 14:29:45.902 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}

[INFO ] 2020-10-13 14:29:46.098 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600}

1.1.1.1 get /index.html 30 0.23    # a log line typed on standard input

{
    "@timestamp" => 2020-10-13T06:30:11.973Z,
          "host" => "node1",
      "@version" => "1",
       "request" => "/index.html",
       "message" => "1.1.1.1 get /index.html 30 0.23",
      "duration" => "0.23",
      "clientip" => "1.1.1.1",
        "method" => "get",
         "bytes" => "30"
}

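Example 3: parse the logs produced by a web server and print them to standard output

For example, for logs produced by Apache, grok ships a dedicated pattern for the Apache log format.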

[root@node4 conf.d]# vim httpd.conf

input {
  file {                                      # read events from a file
    path => ["/var/log/httpd/access_log"]     # file path
    type => "apachelog"                       # event type
    start_position => "beginning"             # read from the start of the file
  }
}

filter {
  grok {                                                # parse the log line
    match => { "message" => "%{COMBINEDAPACHELOG}" }    # Apache combined log format
  }
}

output {
  stdout {
    codec => rubydebug
  }
}

[root@node4 conf.d]# logstash -f /etc/logstash/conf.d/httpd.conf --path.data=/tmp

WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults

Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console

[INFO ] 2020-10-13 17:57:53.743 [main] runner - Starting Logstash {"logstash.version"=>"7.9.1", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 25.262-b10 on 1.8.0_262-b10 +indy +jit [linux-x86_64]"}

[INFO ] 2020-10-13 17:57:53.803 [main] writabledirectory - Creating directory {:setting=>"path.queue", :path=>"/tmp/queue"}

[INFO ] 2020-10-13 17:57:53.819 [main] writabledirectory - Creating directory {:setting=>"path.dead_letter_queue", :path=>"/tmp/dead_letter_queue"}

[WARN ] 2020-10-13 17:57:54.315 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified

[INFO ] 2020-10-13 17:57:54.352 [LogStash::Runner] agent - No persistent UUID file found. Generating new UUID {:uuid=>"99ab593e-1436-49c0-874a-e815644cb316", :path=>"/tmp/uuid"}

[INFO ] 2020-10-13 17:57:56.722 [Converge PipelineAction::Create] Reflections - Reflections took 51 ms to scan 1 urls, producing 22 keys and 45 values

[INFO ] 2020-10-13 17:57:58.642 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>250, "pipeline.sources"=>["/etc/logstash/conf.d/httpd.conf"], :thread=>"#"}

[INFO ] 2020-10-13 17:57:59.663 [[main]-pipeline-manager] javapipeline - Pipeline Java execution initialization time {"seconds"=>1.01}

[INFO ] 2020-10-13 17:58:00.166 [[main]-pipeline-manager] file - No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/tmp/plugins/inputs/file/.sincedb_15940cad53dd1d99808eeaecd6f6ad3f", :path=>["/var/log/httpd/access_log"]}

[INFO ] 2020-10-13 17:58:00.198 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}

[INFO ] 2020-10-13 17:58:00.316 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}

[INFO ] 2020-10-13 17:58:00.356 [[main]

[INFO ] 2020-10-13 17:58:00.852 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9603}

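Access 10.5.100.183 directly in a browser. Both resulting events carry the _grokparsefailure tag, which means %{COMBINEDAPACHELOG} did not match these lines (most likely because the client address field is logged as "-"), so no fields were extracted.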

{
    "@timestamp" => 2020-10-13T10:01:02.347Z,
       "message" => "- - - [13/Oct/2020:18:01:01 +0800] \"GET / HTTP/1.1\" 304 - \"-\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36\"",
      "@version" => "1",
          "tags" => [[0] "_grokparsefailure"],
          "path" => "/var/log/httpd/access_log",
          "type" => "apachelog",
          "host" => "node4"
}
{
    "@timestamp" => 2020-10-13T10:01:02.407Z,
       "message" => "- - - [13/Oct/2020:18:01:01 +0800] \"GET /favicon.ico HTTP/1.1\" 404 209 \"http://10.5.100.183/\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36\"",
      "@version" => "1",
          "tags" => [[0] "_grokparsefailure"],
          "path" => "/var/log/httpd/access_log",
          "type" => "apachelog",
          "host" => "node4"
}

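Events tagged _grokparsefailure were not parsed. One way to deal with this, sketched below under assumptions, is to give grok a second, more lenient pattern to try and to route anything that still fails to a separate file. The fallback pattern and the file path are illustrative and written for the two sample log lines above; they are not taken from the original configuration.

```
filter {
  grok {
    # patterns are tried in order; the first one that matches wins
    match => {
      "message" => [
        "%{COMBINEDAPACHELOG}",
        '(?:%{IPORHOST:clientip}|-) %{NOTSPACE:ident} %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{NOTSPACE:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent}'
      ]
    }
  }
}

output {
  if "_grokparsefailure" in [tags] {
    # keep unparsed lines for later inspection (path is an assumed example)
    file { path => "/tmp/grok_failures.log" }
  } else {
    stdout { codec => rubydebug }
  }
}
```

The fallback pattern is single-quoted so that the double quotes in the log format do not need escaping.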
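We have now walked through the complete Logstash workflow, from input -> filter -> output in order. The next article covers how to implement the following:

Read httpd logs -> store them in redis -> read the data from redis -> store it in elasticsearch.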
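As a preview of that flow, and only as a rough sketch: one Logstash instance could push events onto a Redis list while a second reads them back and indexes them. The host addresses, the key name, and the index name below are assumptions for illustration; the follow-up article may configure this differently.

```
# shipper pipeline (first configuration file): httpd log -> redis
input {
  file {
    path => ["/var/log/httpd/access_log"]
    start_position => "beginning"
  }
}
output {
  redis {
    host => ["127.0.0.1"]     # assumed Redis address
    data_type => "list"
    key => "httpd-logs"       # assumed list name
  }
}

# indexer pipeline (a separate configuration file): redis -> elasticsearch
input {
  redis {
    host => "127.0.0.1"       # assumed Redis address
    data_type => "list"
    key => "httpd-logs"       # must match the shipper's key
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]    # assumed Elasticsearch address
    index => "httpd-%{+YYYY.MM.dd}"
  }
}
```

The two halves belong in separate configuration files run by separate Logstash instances; placed in one file, Logstash would merge them into a single pipeline.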

