
Local Monitoring Module (LMM)

Local Monitoring Module (LMM). Author: Anna Bekkerman abekkerm@ecs.umass.edu. Managing LMM’s Setup. When LMM is started the following components are created: LocalServer and Sender Functionality: Communicate with RAPIDS server CurrentSetup


Presentation Transcript


  1. Local Monitoring Module (LMM) Author: Anna Bekkerman abekkerm@ecs.umass.edu

  2. Managing LMM’s Setup
     • When LMM is started, the following components are created:
       • LocalServer and Sender
         • Functionality: communicate with the RAPIDS server
       • CurrentSetup
         • Functionality: contains parameters of the current monitoring setup
         • Metrics, processes, update rate, etc.
       • DBManager
         • Functionality: concurrent access to the monitoring setup
     [Diagram: CurrentSetup, DBManager, LocalServer, LMM, Sender. Labels: "Sets up parameters", "Requests current parameters", "commands", "signals, events", "Sends signals, events"]

  3. Managing LMM’s Setup
     • Requirement: provide control over the level of monitoring intrusiveness
     • Solution: dynamic modification of the monitoring setup
     • Design:
       • During the experiment, the user sends setup commands
       • At the beginning of each collection session, LMM requests the current setup parameters
       • DBManager handles setup modification commands as well as LMM’s requests
     • Dynamic setup modification has not been implemented in the current version of RAPIDS
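
The design above can be sketched as a mutex-guarded setup object. This is only an illustration, not the RAPIDS code: the fields of CurrentSetup and the method names applySetup and currentSetup are assumptions, since the slides say only that the setup holds metrics, processes and an update rate, and that dynamic modification is not implemented yet.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical layout: the slides list only "metrics, processes, update rate etc."
struct CurrentSetup {
    std::vector<std::string> metrics;
    std::vector<std::string> processes;
    int updateRateSec = 1;
};

// Sketch of a DBManager that serializes access to the setup with a mutex.
class DBManager {
public:
    // Would be called when a setup command from the user arrives.
    void applySetup(const CurrentSetup &s) {
        assert(s.updateRateSec > 0);
        std::lock_guard<std::mutex> lock(mutex_);
        setup_ = s;
    }

    // Would be called by LMM at the beginning of each collection session.
    // Returns a copy, so the caller never reads a half-updated setup.
    CurrentSetup currentSetup() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return setup_;
    }

private:
    mutable std::mutex mutex_;
    CurrentSetup setup_;
};
```

Returning a copy keeps the lock short and lets a collection session work with a consistent snapshot even if a new setup command arrives in the middle of the session.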

  4. Start-up Procedure
     • Start network control application (if needed)
     • Initialize RAPIDS Message Queue (RMQ)
     • Start heartbeat application
     • Launch processes

  5. Network Control
     • Requirement: simulate packet drop rates and transit delays on links between radar nodes and the SOCC
     • Solution: use iptables to forward packets to a user-space application that will delay or drop them
     [Diagram: SOCC (SOCC1, SOCC2) and three radar nodes]

  6. Network Control: Implementation
     • Configure iptables to use the QUEUE target
     • /etc/sysconfig/iptables on emmy5.casa.umass.edu:
         …
         -A INPUT -s 128.119.245.36 -j QUEUE
         …
     • All packets coming from emmy2 will be queued!
     [Diagram: SOCC (SOCC1, SOCC2) and three radar nodes; hosts labeled emmy2 (emmy2.casa.umass.edu) and emmy5]

  7. Network Control: Implementation
     • Start iptables:
         > /sbin/service iptables start
         Applying iptables firewall rules: [ OK ]
     • Load the ip_queue module that forwards packets to user space:
         > /sbin/modprobe ip_queue
         > /sbin/lsmod
         Module      Size   Used by
         ip_queue    14553  0
         …
     • These commands should be executed under root login!
     • To unload the ip_queue module:
         > /sbin/modprobe -r ip_queue
     • To stop iptables:
         > /sbin/service iptables stop

  8. Network Control: Implementation
     • When LMM is started, it launches the simuwan_usr application, which processes the forwarded packets
     • Problem: simuwan_usr must be started by root
     • Solution: use the sudo utility to start the application
       • sudo runs commands with the root user's privileges

  9. How to Set Up the sudo Utility
     • On machine X, edit the sudoers file (as root):
         > /usr/sbin/visudo
     • Specify the commands that should be executed under root login:
         someLogin X = NOPASSWD: /usr/share/rapids/bin/simuwan_usr, /usr/bin/kill
     • Now, while running on X under someLogin, LMM should be able to start/stop simuwan_usr

  10. More on simuwan_usr Application
     • Two types of action can be applied to the packets:
       • Delay for either a constant or a variable amount of time
       • Drop according to the specified drop rate
       • Packets that are not dropped can still be delayed
     • Uses GLib’s event loop to process forwarded packets
       • Each packet is an event
       • Each event should be assigned a verdict: ACCEPT or REJECT
       • Each event can be assigned a timeout before it is dispatched
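
The drop/delay rules above can be sketched as a pure decision function. This is not the simuwan_usr source: the Policy and Decision types and the decide function are hypothetical, a dropped packet is assumed to map to the REJECT verdict, and a "variable" delay is assumed to be uniform between two bounds. The GLib event loop and the packet queue itself are left out.

```cpp
#include <cassert>
#include <random>

enum class Verdict { Accept, Reject };  // the slides name ACCEPT and REJECT

struct Decision {
    Verdict verdict;
    int delayMs;  // timeout before the packet is dispatched; 0 means none
};

// Hypothetical policy parameters for one link.
struct Policy {
    double dropRate;  // probability of dropping a packet, in [0, 1]
    int minDelayMs;   // constant delay when minDelayMs == maxDelayMs
    int maxDelayMs;
};

// Decides the fate of one forwarded packet.
template <class Rng>
Decision decide(const Policy &p, Rng &rng) {
    assert(p.dropRate >= 0.0 && p.dropRate <= 1.0);
    assert(0 <= p.minDelayMs && p.minDelayMs <= p.maxDelayMs);

    std::bernoulli_distribution drop(p.dropRate);
    if (drop(rng)) {
        return {Verdict::Reject, 0};  // assumption: dropped means REJECT
    }
    // Packets that survive the drop test can still be delayed.
    std::uniform_int_distribution<int> delay(p.minDelayMs, p.maxDelayMs);
    return {Verdict::Accept, delay(rng)};
}
```

A caller would seed a generator once, for example `std::mt19937 rng(std::random_device{}());`, and call `decide` for every queued packet, scheduling accepted packets after `delayMs` milliseconds.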

  11. RAPIDS Message Queue (RMQ)
     • RMQ employs Unix message queues to store Messaging and Application events
     • Events are generated:
       • Through wrapped library function calls
       • Using RAPIDS API
     • RMQManager:
       • Creates/removes RMQ
       • Periodically retrieves events and prepares them for sending to the RAPIDS server
     [Diagram: applications place events in RMQ through function calls; inside LMM, RMQManager reads RMQ and passes events on to the server]

  12. Network Monitoring
     • Requirement: monitor status of links between SOCC and radar nodes
     • Solution: send “I’m alive” messages from radars to the SOCC
     • Drawback: false alarms

  13. Network Monitoring: Implementation
     • heartbeat_socc application is started on the SOCC node
       • If there is more than one node in the SOCC, the first one specified in the configuration file is chosen
     • heartbeat_sensor applications are started on the rest of the nodes
     • SOCC periodically pings nodes
     [Diagram: SOCC (SOCC1, SOCC2) and radar nodes, labeled heartbeat_socc and heartbeat_sensor]

  14. Network Monitoring: Implementation
     • If node X replies, SOCC generates Variable event: “X=true”
     • If node X does not reply, SOCC generates Variable event: “X=false”
     • When the RAPIDS server receives a false event for node X, it reports failure for connection SOCC ↔ X
     [Diagram: heartbeat_socc on the SOCC (SOCC1, SOCC2) and heartbeat_sensor on the radar nodes; a “Variable” event goes to RMQ, and RMQManager in LMM forwards it to the server]
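
A minimal sketch of the event text described above, assuming the Variable event is carried as a plain "name=value" string. The function names variableEvent and reportsLinkFailure are hypothetical and not part of the RAPIDS API.

```cpp
#include <cassert>
#include <string>

// Builds the text of the Variable event generated by the SOCC for one node.
std::string variableEvent(const std::string &node, bool replied) {
    assert(!node.empty());
    return node + "=" + (replied ? "true" : "false");
}

// Server side: a "false" event means the link SOCC <-> node is reported as failed.
bool reportsLinkFailure(const std::string &event) {
    const std::string suffix = "=false";
    return event.size() > suffix.size() &&
           event.compare(event.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

Because a single lost reply already yields “X=false”, a scheme like this also shows where the false alarms mentioned earlier come from.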

  15. Launching Processes
     • User provides commands to start/stop processes in the configuration file
     • RAPIDS server sends these commands to LMMs while setting them up
     • LMM writes the commands to a script
       • Scripts are created in the home directory and deleted at the end of the experiment
       • Script name is start_commands/stop_commands followed by the sequence number of the node where the script is executed
         • SOCC nodes have 1-digit sequence numbers: 1, 2, 3 …
         • Sensor nodes have 3-digit sequence numbers: 100, 101, 102 …
     • The starting script is executed at the beginning of the experiment
     • The stopping script is executed when LMM receives the stop signal
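
The naming rule above can be sketched as follows. The slides do not say whether a separator sits between the prefix and the sequence number, so plain concatenation is an assumption here, as are the names Role, sequenceNumber and scriptName and the 0-based node index.

```cpp
#include <cassert>
#include <string>

enum class Role { Socc, Sensor };

// Maps a 0-based node index (hypothetical) to the sequence number scheme
// from the slides: SOCC nodes 1, 2, 3 ..., sensor nodes 100, 101, 102 ...
int sequenceNumber(Role role, int index) {
    assert(index >= 0);
    if (role == Role::Socc) {
        assert(index < 9);  // SOCC sequence numbers have one digit
        return 1 + index;
    }
    assert(index < 900);    // sensor sequence numbers have three digits
    return 100 + index;
}

// Builds the script name; assumes the number is appended with no separator.
std::string scriptName(bool start, int seq) {
    return std::string(start ? "start_commands" : "stop_commands") +
           std::to_string(seq);
}
```

Under these assumptions the first SOCC node would run start_commands1 and the second sensor node would run stop_commands101 on shutdown.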

  16. Collection Session: Class Diagram
     • CollectionSession: void start()
     • CommandProvider: vector<Command *> commands(Bucket *s, Bucket *p, Bucket *e)
     • CommandExecutor: vector<Command *> commands; void execute()
     • Sender: void send(Bucket *b)
     • Bucket, SystemBucket, ProcessBucket, EventBucket
     • RMQManager, SyncBuffer, DBManager
     • Relationship labels in the diagram: creates, executes, starts

  17. Collection Session: Algorithm
     • Create three Buckets for storing system metrics, process metrics and events
     • Generate commands using CommandProvider
       • CommandProvider requests the current set of monitored metrics from DBManager
       • Depending on the current setup, a different set of commands will be generated
     [Class box: Bucket: virtual int sizeInBytes() = 0; virtual int writeContent(unsigned char *buf, int offset) const = 0; virtual void addMetric()]

  18. Collection Session: Algorithm
     • Run CommandExecutor
       • System/process metrics:
         • Each command reads the current value of a certain metric
         • For example: CPU utilization, workload etc.
         • Command writes metric values to a bucket
       • Events:
         • RMQManager inserts events into a SyncBuffer
         • Special EventCatcher command retrieves events from the buffer and puts them into a bucket
     • Send events/metrics to the RAPIDS server using Sender
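
The collection loop can be sketched as below. The class names Command and CommandExecutor come from the slides, but everything else is a simplification made for illustration: Bucket is reduced to a list of name/value pairs instead of the serializing interface in the class diagram, FixedMetric is a hypothetical command that stands in for readers such as CPUUsage, and smart pointers replace the raw vector<Command *>.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the Bucket classes: stores name/value pairs.
struct Bucket {
    std::vector<std::pair<std::string, double>> metrics;
    void addMetric(const std::string &name, double value) {
        metrics.emplace_back(name, value);
    }
};

// A command reads one metric and writes it to its bucket.
class Command {
public:
    explicit Command(Bucket *b) : bucket(b) { assert(b != nullptr); }
    virtual ~Command() = default;
    virtual void start() = 0;

protected:
    Bucket *bucket;
};

// Hypothetical command that reports a fixed value instead of reading /proc.
class FixedMetric : public Command {
public:
    FixedMetric(Bucket *b, std::string name, double value)
        : Command(b), name_(std::move(name)), value_(value) {}
    void start() override { bucket->addMetric(name_, value_); }

private:
    std::string name_;
    double value_;
};

// Runs every generated command once per collection session.
class CommandExecutor {
public:
    std::vector<std::unique_ptr<Command>> commands;
    void execute() {
        for (auto &c : commands) c->start();
    }
};
```

In a full session the command list would come from CommandProvider, and the filled buckets would then be handed to Sender.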

  19. Commands: Class Diagram
     • Command: Bucket *bucket; void store(); void start()
     • Command classes: CPUUsage, Workload, EventsCatcher, MemoryUsage, Ps

  20. Commands: Implementation
     • CPUUsage
       • Reads values from /proc/stat
     • MemoryUsage
       • Reads values from /proc/meminfo
     • Workload
       • Reads values from /proc/loadavg
     • Ps
       • Looks through all process subdirectories in /proc
       • Reads the filename of the process from /proc/[pid]/stat
       • Stores information about processes whose names were provided in the RAPIDS configuration file
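
As an illustration of what a command like CPUUsage has to do, the sketch below parses the aggregate "cpu" line of /proc/stat and derives utilization from two samples. It is not the RAPIDS implementation: the names CpuTimes, parseCpuLine and utilization are hypothetical, and counting iowait as idle time is a choice made here.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct CpuTimes {
    unsigned long long busy = 0;
    unsigned long long total = 0;
};

// Parses the aggregate "cpu" line of /proc/stat. Only the first eight
// counters (user nice system idle iowait irq softirq steal) are summed,
// because guest time is already included in user and nice.
bool parseCpuLine(const std::string &line, CpuTimes &out) {
    std::istringstream in(line);
    std::string label;
    if (!(in >> label) || label != "cpu") return false;

    unsigned long long v = 0, idle = 0, total = 0;
    for (int field = 1; field <= 8 && (in >> v); ++field) {
        total += v;
        if (field == 4 || field == 5) idle += v;  // idle and iowait
    }
    if (total == 0) return false;
    out.total = total;
    out.busy = total - idle;
    return true;
}

// Fraction of non-idle time between two samples, in [0, 1].
double utilization(const CpuTimes &before, const CpuTimes &after) {
    assert(after.total >= before.total && after.busy >= before.busy);
    const unsigned long long dt = after.total - before.total;
    if (dt == 0) return 0.0;
    return static_cast<double>(after.busy - before.busy) /
           static_cast<double>(dt);
}
```

The counters in /proc/stat are cumulative since boot, so a command would read the first line of the file once per collection session and compare it with the previous sample.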
