Security:
Distributing employee access levels (via multiple AWS accounts);
Distributing employee access levels for databases;
Distributing microservices’ access to database tables
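
As an illustration of per-microservice table access, the sketch below turns a mapping of service roles to table privileges into PostgreSQL GRANT statements. The role names, table names, and the ACCESS mapping are hypothetical and only show the idea; a real setup would also create the roles and revoke default privileges.

```python
# Hypothetical mapping: service role -> {table: [privileges]}.
# None of these names come from the original list; they are placeholders.
ACCESS = {
    "billing_service": {"invoices": ["SELECT", "INSERT"], "customers": ["SELECT"]},
    "catalog_service": {"products": ["SELECT", "INSERT", "UPDATE"]},
}

ALLOWED_PRIVILEGES = {"SELECT", "INSERT", "UPDATE", "DELETE"}


def grant_statements(access):
    """Return one GRANT statement per (role, table) pair, in sorted order."""
    statements = []
    for role in sorted(access):
        for table in sorted(access[role]):
            privileges = access[role][table]
            unknown = set(privileges) - ALLOWED_PRIVILEGES
            if unknown:
                raise ValueError(f"unsupported privileges: {sorted(unknown)}")
            # Only accept plain identifiers so nothing unexpected ends up in the SQL.
            if not (role.isidentifier() and table.isidentifier()):
                raise ValueError(f"invalid identifier: {role!r} / {table!r}")
            statements.append(
                f"GRANT {', '.join(privileges)} ON TABLE {table} TO {role};"
            )
    return statements


if __name__ == "__main__":
    for statement in grant_statements(ACCESS):
        print(statement)
```

Keeping the mapping in one place makes it easy to review which service can touch which table.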
Monitoring and logging:
Load and QPS for PostgreSQL;
Load and QPS for Redis;
Load and QPS for AWS Lambda;
Configured indexes in Elasticsearch;
Grafana dashboards for monitoring load and network activity in the Kubernetes cluster;
Dashboards with business metrics (RPS on the backend and others);
Alert thresholds + automated notifications through multiple communication channels
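
The threshold-based alerting described above can be sketched roughly as follows. This is a simplified illustration, not the actual alerting stack: the metric names, limits, and channel names are assumptions, and a real deployment would delegate this to the monitoring system's own alerting rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str       # hypothetical metric name, e.g. "postgres_qps"
    limit: float      # alert when the sampled value is above this limit
    channels: tuple   # hypothetical notification channels, e.g. ("slack", "email")


def evaluate(samples, thresholds):
    """Return (channel, message) pairs for every threshold that is exceeded."""
    notifications = []
    for threshold in thresholds:
        value = samples.get(threshold.metric)
        # Metrics with no sample are skipped rather than treated as zero.
        if value is not None and value > threshold.limit:
            message = f"{threshold.metric}={value} exceeds {threshold.limit}"
            for channel in threshold.channels:
                notifications.append((channel, message))
    return notifications


if __name__ == "__main__":
    rules = [
        Threshold("postgres_qps", 1000.0, ("slack", "email")),
        Threshold("redis_qps", 100.0, ("slack",)),
    ]
    for channel, message in evaluate({"postgres_qps": 1200.0, "redis_qps": 50.0}, rules):
        print(channel, message)
```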
Horizontal scaling:
Support for balancing gRPC connections between microservices;
Configuration of Horizontal Pod Autoscaler + Cluster Autoscaler + Pod Disruption Budget;
Load testing;
Connection Poolers (PgBouncer) for PostgreSQL;
Master + ReadOnly Replicas for PostgreSQL;
Caching;
Backups
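
To make the Horizontal Pod Autoscaler item above more concrete, the sketch below reproduces the scaling rule documented for Kubernetes: desired replicas = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance (0.1 by default). The function name, the replica bounds, and the sample numbers are illustrative assumptions; the real controller also handles missing metrics, pod readiness, and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation for a single metric."""
    ratio = current_value / target_value
    # Within the tolerance band the autoscaler leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds (illustrative defaults here).
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 pods at 90% average CPU with a 60% target.
    print(desired_replicas(4, 90, 60))
```

Running the numbers this way before load testing helps choose sensible targets and replica bounds.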
Monitoring:
Installation of Prometheus monitoring and deployment of Prometheus operators in Docker;
Deployment of monitoring and Grafana using Ansible, followed by configuration and deployment of Alertmanager;
Integration with the general Ansible playbook;
Addition of rules for custom metrics of services such as PostgreSQL, PgBouncer, Kafka, ClickHouse, ScyllaDB, HAProxy, etc.
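
As a rough illustration of adding rules for custom metrics, the sketch below builds a rule group with the structure Prometheus alerting rule files use (groups, name, rules, alert, expr, for, labels, annotations). The alert names, the metric names pg_up and pgbouncer_waiting_clients, and the limits are assumptions that depend on the exporters in use. The output is JSON, which YAML parsers generally accept; converting it to conventional YAML is left out to keep the example dependency-free.

```python
import json


def alert_rule(name, expr, duration, severity, summary):
    """Build one alerting rule as a plain dictionary."""
    return {
        "alert": name,
        "expr": expr,
        "for": duration,
        "labels": {"severity": severity},
        "annotations": {"summary": summary},
    }


def rule_group(group_name, rules):
    """Wrap rules in the top-level 'groups' structure of a rule file."""
    return {"groups": [{"name": group_name, "rules": list(rules)}]}


# Metric names below are assumptions; check what your exporters actually expose.
DATABASE_RULES = rule_group("database", [
    alert_rule("PostgresDown", "pg_up == 0", "1m", "critical",
               "PostgreSQL exporter reports the instance as down"),
    alert_rule("PgBouncerClientsWaiting", "pgbouncer_waiting_clients > 10", "5m",
               "warning", "Clients are queuing for a PgBouncer connection"),
])

if __name__ == "__main__":
    print(json.dumps(DATABASE_RULES, indent=2))
```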