Commit 2e1b8a55 authored by Noc User ubuntu-srv-02

add-docs-monitoring

parent 2f26907f
Pipeline #34930 failed with stages
in 3 minutes and 30 seconds
......@@ -38,11 +38,11 @@ forms:
type: bool
default: False # yamllint disable-line rule:truthy
enable_managedobject:
- label: "enable_liftbridge"
+ label: "enable_managedobject"
type: bool
default: True # yamllint disable-line rule:truthy
enable_task:
- label: "enable_liftbridge"
+ label: "enable_task"
type: bool
default: False # yamllint disable-line rule:truthy
......
# How-to Guides
* [**Deploy NOC**](./deploy-noc/deploy-noc.md)
* [**Monitoring NOC**](./monitoring/monitoring-noc.md)
# Monitoring NOC
## Install Docker and Docker Compose
1. Install Docker: https://docs.docker.com/engine/install/
2. Install Docker Compose: https://docs.docker.com/compose/install/
## Create docker-compose.yml and its environment
1. Create **docker-compose.yml**:
```
version: '3'
services:
  grafana:
    image: grafana/grafana-oss
    restart: always
    ports:
      - 3000:3000
    user: '0'
    volumes:
      - "./grafana/data/:/var/lib/grafana/"
      - "./grafana/grafana-selfmon-dashboards/dashboards/noc/:/var/lib/grafana/dashboards"
      - "./grafana/grafana-selfmon-dashboards/provisioning/datasources/:/etc/grafana/provisioning/datasources/"
      - "./grafana/grafana-selfmon-dashboards/provisioning/dashboards/:/etc/grafana/provisioning/dashboards/"
    networks:
      - mon
  vmagent:
    image: victoriametrics/vmagent
    depends_on:
      - "vm"
    ports:
      - 8429:8429
    volumes:
      - "./vm/vmagentdata:/vmagentdata"
      - "./vm/prometheus.yml:/etc/prometheus/prometheus.yml"
    command:
      - '--promscrape.config=/etc/prometheus/prometheus.yml'
      - '--remoteWrite.url=http://vm:8428/api/v1/write'
    restart: always
    networks:
      - mon
  vm:
    image: victoriametrics/victoria-metrics
    ports:
      - 8428:8428
    volumes:
      - "./vm/vmdata/:/storage"
    command:
      - '--storageDataPath=/storage'
      - '--retentionPeriod=60d'
      - '--httpListenAddr=:8428'
    restart: always
    networks:
      - mon
  alertmanager:
    image: prom/alertmanager
    restart: always
    volumes:
      - "./vm:/alertmanager"
    command:
      - --config.file=/alertmanager/alertmanager.yml
      - --web.external-url=https://alertmanager:9093
    networks:
      - mon
  prometheus-bot:
    image: tienbm90/prometheus-bot:0.0.1
    volumes:
      - ./vm/telegrambot/config.yaml:/config.yaml
      - ./vm/telegrambot/:/etc/telegrambot/
    networks:
      - mon
    restart: always
  vmalert:
    image: victoriametrics/vmalert
    depends_on:
      - "vm"
      - "alertmanager"
    volumes:
      - "./vm/noc-prometheus-alerts/:/etc/alerts/"
    command:
      # the VictoriaMetrics service is named "vm" in this file,
      # so that is the hostname vmalert must use
      - '--datasource.url=http://vm:8428/'
      - '--remoteRead.url=http://vm:8428/'
      - '--remoteWrite.url=http://vm:8428/'
      - '--notifier.url=http://alertmanager:9093/'
      - '--rule=/etc/alerts/*.rules.yml'
    networks:
      - mon
    restart: always
networks:
  mon:
```
2. Create a **grafana** directory and, inside it, a **data** directory
3. Change from the directory containing **docker-compose.yml** into the **grafana** directory
4. Run git clone git@code.getnoc.com:noc/grafana-selfmon-dashboards.git
5. Go back to the directory containing **docker-compose.yml** and create a **vm** directory
6. In the **vm** directory, create a **vmdata** directory and a **prometheus.yml** file with the following contents:
```
# my global config
# metrics are collected every 10 seconds
global:
  scrape_interval: 10s # Scrape targets every 10 seconds.
  evaluation_interval: 10s # Evaluate rules every 10 seconds.
# alerting rules for the system
rule_files:
  - noc-prometheus-alerts/*.rules.yml
# where Alertmanager lives
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093
# the important section: metrics collection
scrape_configs:
  # Prometheus self-monitoring (here: vmagent)
  - job_name: 'vmagent'
    static_configs:
      - targets: ['vmagent:8429']
        labels:
          env: 'infrastructure'
  - job_name: 'victoriametrics'
    static_configs:
      - targets: ['vm:8428']
  # collect metrics from the NOC daemons, discovered through Consul
  - job_name: 'noc'
    consul_sd_configs:
      - server: '<IP address of the NOC server>:8500' # for example, 192.168.1.25
    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: .*,noc,.*
        action: keep
      - source_labels: [__meta_consul_service]
        target_label: job
      - source_labels: [env]
        target_label: env
        replacement: "dev" # set the NOC installation type here
  # collect metrics from ClickHouse
  - job_name: 'ch'
    scrape_interval: 30s
    static_configs:
      - targets:
          - <IP address of the server where ClickHouse is installed>:9116 # if you installed with Docker and want to use a hostname instead of an IP address, give the container read access to "/etc/hosts"
        labels:
          env: 'dev' # set the NOC installation type here
```
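The `keep` rule in the `noc` job above can be illustrated with a short Python sketch. It assumes that Consul service discovery exposes `__meta_consul_tags` as the tag list joined by commas and also wrapped in commas (for example `,noc,web,`), and that relabel regexes must match the whole value. The helper names `consul_tags_label` and `is_kept` are made up for this illustration and are not part of any tool.

```python
import re

# Assumption: tags ["noc", "web"] are exposed as the label value ",noc,web,".
def consul_tags_label(tags, separator=","):
    """Build the value that the relabel rule sees for a list of tags."""
    return separator + separator.join(tags) + separator


# Same pattern as in prometheus.yml. Relabel regexes are anchored at both
# ends, so re.fullmatch is the closest Python analogue.
KEEP_NOC = re.compile(r".*,noc,.*")


def is_kept(tags):
    """Return True if a target with these Consul tags survives the keep rule."""
    return KEEP_NOC.fullmatch(consul_tags_label(tags)) is not None


print(is_kept(["noc", "web"]))  # True: ",noc," appears in ",noc,web,"
print(is_kept(["nocturnal"]))   # False: the tag must be exactly "noc"
```

Because the value is wrapped in separators, the pattern does not need to care whether `noc` is the first, last, or only tag.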
7. Run git clone git@code.getnoc.com:noc/noc-prometheus-alerts.git inside the **vm** directory
8. In the **vm** directory, create an **alertmanager.yml** file with the following contents, substituting your own values:
```
global:
  resolve_timeout: 5m
  smtp_from: alertmanager@prometheus.example.com
  smtp_smarthost: mx1.example.com:25
  smtp_require_tls: false
route:
  receiver: 'prometheus-bot'
  routes:
    - receiver: 'prometheus-bot'
      group_interval: 10m
receivers:
  - name: 'prometheus-bot'
    webhook_configs:
      - url: 'http://prometheus-bot:9087/alert/<Telegram chat ID>'
  - name: email
    email_configs:
      - send_resolved: false
        to: XXX@example.com
        headers:
          From: alertmanager@prometheus.example.com
          Subject: '{{ template "email.default.subject" . }}'
          To: XXXXXXX@example.com
        html: '{{ template "email.default.html" . }}'
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'instance']
```
9. In the **vm** directory, create a **telegrambot** directory (docker-compose.yml mounts ./vm/telegrambot/) containing two files, **config.yaml** and **template.tmpl**. Remember to substitute your own values.
config.yaml:
```
telegram_token: "<Telegram bot token>"
template_path: "/etc/telegrambot/template.tmpl"
time_zone: "Europe/Moscow"
split_token: "|"
time_outdata: "02/01/2006 15:04:05"
split_msg_byte: 10000
```
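The `time_outdata` value above looks like a Go reference-time layout, in which `02` stands for the day, `01` for the month and `2006` for the year. Assuming the bot interprets it that way, the sketch below shows the equivalent Python `strftime` pattern. The mapping table and the function `go_layout_to_strftime` are hypothetical helpers that cover only the tokens used in this config.

```python
from datetime import datetime

# Hypothetical mapping for the tokens that occur in "02/01/2006 15:04:05".
# The four-digit year comes first so it is replaced before the two-digit tokens.
GO_TO_STRFTIME = {
    "2006": "%Y",  # year
    "01": "%m",    # month
    "02": "%d",    # day of month
    "15": "%H",    # hour, 24-hour clock
    "04": "%M",    # minute
    "05": "%S",    # second
}


def go_layout_to_strftime(layout):
    """Translate a simple Go time layout into a strftime pattern."""
    for go_token, py_token in GO_TO_STRFTIME.items():
        layout = layout.replace(go_token, py_token)
    return layout


pattern = go_layout_to_strftime("02/01/2006 15:04:05")
print(pattern)                                            # %d/%m/%Y %H:%M:%S
print(datetime(2024, 3, 5, 14, 7, 9).strftime(pattern))   # 05/03/2024 14:07:09
```

In other words, alert timestamps would be rendered as day/month/year followed by a 24-hour time.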
template.tmpl:
```
{{ $length := len .GroupLabels -}} {{ if ne $length 0 }}
<b>Grouped for:</b>
{{ range $key,$val := .GroupLabels -}}
{{$key}} = <code>{{$val}}</code>
{{ end -}}
{{ end -}}
{{if eq .Status "firing"}}
Status: <b>{{.Status | str_UpperCase}} 🔥</b>
{{end -}}
{{if eq .Status "resolved"}}
Status: <b>{{.Status | str_UpperCase}} ✅</b>
{{end }}
<b>Active Alert List:</b>
{{- range $val := .Alerts }}
Alert: {{ $val.Labels.alertname }}
{{if HasKey $val.Annotations "message" -}}
Message:{{ $val.Annotations.message }}
{{end -}}
{{if HasKey $val.Annotations "summary" -}}
Summary:{{ $val.Annotations.summary }}
{{end -}}
{{if HasKey $val.Annotations "description" -}}
Description:{{ $val.Annotations.description }}
{{end -}}
{{if HasKey $val.Labels "name" -}}
Name:{{ $val.Labels.name }}
{{end -}}
{{if HasKey $val.Labels "partition" -}}
Partition:{{ $val.Labels.partition}}
{{end -}}
{{if HasKey $val.Labels "group" -}}
Group:{{ $val.Labels.group }}
{{end -}}
{{if HasKey $val.Labels "instance" -}}
Instance:{{ $val.Labels.instance }}
{{end -}}
{{if HasKey $val.Labels "queue" -}}
Queue:{{ $val.Labels.queue }}
{{end -}}
{{if HasKey $val.Labels "pool" -}}
Pool:{{ $val.Labels.pool }}
{{end -}}
{{if HasKey $val.Annotations "value" -}}
Value:{{ $val.Annotations.value }}
{{end -}}
Active from: {{ $val.StartsAt | str_FormatDate }}
{{ range $key, $value := $val.Annotations -}}
{{- end -}}
{{- end -}}
```
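The `inhibit_rules` section of **alertmanager.yml** above mutes a `warning` alert while a `critical` alert with the same `alertname` and `instance` is firing. The following Python sketch is a simplified model of that rule, not Alertmanager's implementation. The function `is_inhibited` and the sample `DiskFull` alerts are invented for illustration.

```python
# Labels that must match between the source and target alerts,
# copied from the `equal` list in alertmanager.yml.
EQUAL = ("alertname", "instance")


def is_inhibited(target, firing):
    """Return True if `target` would be muted by the inhibit rule.

    `target` is a dict of labels and `firing` is a list of such dicts for
    the alerts that are currently firing.
    """
    # Only alerts matching target_match (severity: warning) can be muted.
    if target.get("severity") != "warning":
        return False
    # A source must match source_match (severity: critical) and agree with
    # the target on every label listed in `equal`.
    return any(
        source.get("severity") == "critical"
        and all(source.get(name) == target.get(name) for name in EQUAL)
        for source in firing
    )


# Hypothetical alerts used only to exercise the rule.
critical = {"alertname": "DiskFull", "instance": "host1", "severity": "critical"}
same_host = {"alertname": "DiskFull", "instance": "host1", "severity": "warning"}
other_host = {"alertname": "DiskFull", "instance": "host2", "severity": "warning"}

print(is_inhibited(same_host, [critical]))   # True
print(is_inhibited(other_host, [critical]))  # False
```

The effect is that the Telegram chat receives the critical notification without a duplicate warning for the same alert on the same instance.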
## Restart the NOC services
1. Go to the directory where NOC is installed (/opt/noc)
2. Run the command ./noc ctl restart all
License
=======
NetworkX is distributed with the 3-clause BSD license.
::
Copyright (C) 2004-2020, NetworkX Developers
Aric Hagberg <hagberg@lanl.gov>
Dan Schult <dschult@colgate.edu>
Pieter Swart <swart@lanl.gov>
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided
with the distribution.
* Neither the name of the NetworkX Developers nor the names of its
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
"""
=======
Mayavi2
=======
"""
import networkx as nx
import numpy as np
from mayavi import mlab
# some graphs to try
# H=nx.krackhardt_kite_graph()
# H=nx.Graph();H.add_edge('a','b');H.add_edge('a','c');H.add_edge('a','d')
# H=nx.grid_2d_graph(4,5)
H = nx.cycle_graph(20)
# reorder nodes from 0,len(G)-1
G = nx.convert_node_labels_to_integers(H)
# 3d spring layout
pos = nx.spring_layout(G, dim=3)
# numpy array of x,y,z positions in sorted node order
xyz = np.array([pos[v] for v in sorted(G)])
# scalar colors
scalars = np.array(list(G.nodes())) + 5
pts = mlab.points3d(
    xyz[:, 0],
    xyz[:, 1],
    xyz[:, 2],
    scalars,
    scale_factor=0.1,
    scale_mode="none",
    colormap="Blues",
    resolution=20,
)
pts.mlab_source.dataset.lines = np.array(list(G.edges()))
tube = mlab.pipeline.tube(pts, tube_radius=0.01)
mlab.pipeline.surface(tube, color=(0.8, 0.8, 0.8))
mlab.show()
.. _examples_gallery:
Gallery
=======
General-purpose and introductory examples for NetworkX.
The `tutorial <../tutorial.html>`_ introduces conventions and basic graph
manipulations.
"""
===========
Eigenvalues
===========
Create a G{n,m} random graph and compute the eigenvalues.
"""
import matplotlib.pyplot as plt
import networkx as nx
import numpy.linalg
n = 1000 # 1000 nodes
m = 5000 # 5000 edges
G = nx.gnm_random_graph(n, m)
L = nx.normalized_laplacian_matrix(G)
e = numpy.linalg.eigvals(L.toarray())  # dense copy of the sparse Laplacian
print("Largest eigenvalue:", max(e))
print("Smallest eigenvalue:", min(e))
plt.hist(e, bins=100) # histogram with 100 bins
plt.xlim(0, 2) # eigenvalues between 0 and 2
plt.show()
"""
==================
Heavy Metal Umlaut
==================
Example using unicode strings as graph labels.
Also shows creative use of the Heavy Metal Umlaut:
https://en.wikipedia.org/wiki/Heavy_metal_umlaut
"""
import matplotlib.pyplot as plt
import networkx as nx
hd = "H" + chr(252) + "sker D" + chr(252)
mh = "Mot" + chr(246) + "rhead"
mc = "M" + chr(246) + "tley Cr" + chr(252) + "e"
st = "Sp" + chr(305) + "n" + chr(776) + "al Tap"
q = "Queensr" + chr(255) + "che"
boc = "Blue " + chr(214) + "yster Cult"
dt = "Deatht" + chr(246) + "ngue"
G = nx.Graph()
G.add_edge(hd, mh)
G.add_edge(mc, st)
G.add_edge(boc, mc)
G.add_edge(boc, dt)
G.add_edge(st, dt)
G.add_edge(q, st)
G.add_edge(dt, mh)
G.add_edge(st, mh)
# write in UTF-8 encoding; closing the file flushes it before it is read back
with open("edgelist.utf-8", "wb") as fh:
    nx.write_multiline_adjlist(G, fh, delimiter="\t", encoding="utf-8")
# read and store in UTF-8
with open("edgelist.utf-8", "rb") as fh:
    H = nx.read_multiline_adjlist(fh, delimiter="\t", encoding="utf-8")
for n in G.nodes():
    if n not in H:
        print(False)
print(list(G.nodes()))
pos = nx.spring_layout(G)
nx.draw(G, pos, font_size=16, with_labels=False)
for p in pos:  # raise text positions
    pos[p][1] += 0.07
nx.draw_networkx_labels(G, pos)
plt.show()
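The band names in the example above are assembled from code points with `chr()`. As a small standard-library illustration (not part of the NetworkX example itself), the sketch below names the characters involved and checks that a label survives a UTF-8 round trip, which is what writing and reading the adjacency list relies on.

```python
import unicodedata

# Same constructions as in the example above.
hd = "H" + chr(252) + "sker D" + chr(252)
st = "Sp" + chr(305) + "n" + chr(776) + "al Tap"

# Show which characters the numeric code points stand for.
for ch in (chr(252), chr(305), chr(776)):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# The non-ASCII characters each take two bytes in UTF-8, so the byte
# length is larger than the number of code points.
encoded = st.encode("utf-8")
print(len(st), len(encoded))

# Decoding gives back exactly the same label.
print(encoded.decode("utf-8") == st)
```

Note that `chr(776)` is a combining diaeresis, so the "n" with an umlaut in the last name is two code points rather than one precomposed character.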
"""
==========================
Iterated Dynamical Systems
==========================
Digraphs from Integer-valued Iterated Functions
Sums of cubes on 3N
-------------------
The number 153 has a curious property.
Let 3N={3,6,9,12,...} be the set of positive multiples of 3. Define an
iterative process f:3N->3N as follows: for a given n, take each digit
of n (in base 10), cube it and then sum the cubes to obtain f(n).
When this process is repeated, the resulting series n, f(n), f(f(n)),...
terminates in 153 after a finite number of iterations (the process ends
because 153 = 1**3 + 5**3 + 3**3).
In the language of discrete dynamical systems, 153 is the global
attractor for the iterated map f restricted to the set 3N.
For example: take the number 108
f(108) = 1**3 + 0**3 + 8**3 = 513
and
f(513) = 5**3 + 1**3 + 3**3 = 153
So, starting at 108 we reach 153 in two iterations,
represented as:
108->513->153
Computing all orbits of 3N up to 10**5 reveals that the attractor
153 is reached in a maximum of 14 iterations. In this code we
show that 13 cycles is the maximum required for all integers (in 3N)
less than 10,000.
The smallest number that requires 13 iterations to reach 153, is 177, i.e.,
177->687->1071->345->216->225->141->66->432->99->1458->702->351->153
The resulting large digraphs are useful for testing network software.
The general problem
-------------------
Given numbers n, a power p and base b, define F(n; p, b) as the sum of
the digits of n (in base b) raised to the power p. The above example
corresponds to f(n)=F(n; 3,10), and below F(n; p, b) is implemented as
the function powersum(n,p,b). The iterative dynamical system defined by
the mapping n:->f(n) above (over 3N) converges to a single fixed point;
153. Applying the map to all positive integers N, leads to a discrete
dynamical process with 5 fixed points: 1, 153, 370, 371, 407. Modulo 3
those numbers are 1, 0, 1, 2, 2. The function f above has the added
property that it maps a multiple of 3 to another multiple of 3; i.e. it
is invariant on the subset 3N.
The squaring of digits (in base 10) results in cycles and the
single fixed point 1. I.e., from a certain point on, the process
starts repeating itself.
keywords: "Recurring Digital Invariant", "Narcissistic Number",
"Happy Number"
The 3n+1 problem
----------------
There is a rich history of mathematical recreations
associated with discrete dynamical systems. The most famous
is the Collatz 3n+1 problem. See the function
collatz_problem_digraph below. The Collatz conjecture
--- that every orbit returns to the fixed point 1 in finite time
--- is still unproven. Even the great Paul Erdos said "Mathematics
is not yet ready for such problems", and offered $500
for its solution.
keywords: "3n+1", "3x+1", "Collatz problem", "Thwaite's conjecture"
"""
import networkx as nx
nmax = 10000
p = 3
def digitsrep(n, b=10):
    """Return list of digits comprising n represented in base b.
    n must be a nonnegative integer"""
    if n <= 0:
        return [0]
    dlist = []
    while n > 0:
        # Prepend next least-significant digit
        dlist = [n % b] + dlist
        # Floor-division
        n = n // b
    return dlist
def powersum(n, p, b=10):
    """Return sum of digits of n (in base b) raised to the power p."""
    dlist = digitsrep(n, b)
    sum = 0
    for k in dlist:
        sum += k ** p
    return sum
def attractor153_graph(n, p, multiple=3, b=10):
    """Return digraph of iterations of powersum(n,3,10)."""
    G = nx.DiGraph()
    for k in range(1, n + 1):
        if k % multiple == 0 and k not in G:
            k1 = k
            knext = powersum(k1, p, b)
            while k1 != knext:
                G.add_edge(k1, knext)
                k1 = knext
                knext = powersum(k1, p, b)
    return G
def squaring_cycle_graph_old(n, b=10):
    """Return digraph of iterations of powersum(n,2,10)."""
    G = nx.DiGraph()
    for k in range(1, n + 1):
        k1 = k
        G.add_node(k1)  # case k1==knext, at least add node
        knext = powersum(k1, 2, b)
        G.add_edge(k1, knext)