How to find the working set size using PerfAccel analytics
Hot data is a term heard often from technology evangelists. It typically refers to critical data that is accessed frequently. The working set size is the amount of hot data accessed in a given interval of time in a fixed environment.
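To make the definition concrete, here is a minimal Python sketch that computes a working set size from an access trace. It is not how PerfAccel measures it; the trace format, the `Access` record, the 4 KB block size and the `min_accesses` threshold are all assumptions made for illustration.

```python
from collections import Counter, namedtuple

# Hypothetical trace record: timestamp in seconds, file path, block index.
Access = namedtuple("Access", ["timestamp", "path", "block"])

BLOCK_SIZE = 4096  # assumed block size in bytes


def working_set_size(trace, start, end, block_size=BLOCK_SIZE, min_accesses=1):
    """Bytes of distinct blocks accessed at least min_accesses times in [start, end)."""
    counts = Counter(
        (a.path, a.block) for a in trace if start <= a.timestamp < end
    )
    hot = [key for key, n in counts.items() if n >= min_accesses]
    return len(hot) * block_size


trace = [
    Access(0.5, "/data/a.db", 0),
    Access(1.0, "/data/a.db", 1),
    Access(1.5, "/data/a.db", 0),  # re-read: the block is counted only once
    Access(2.5, "/data/b.db", 7),
    Access(9.0, "/data/a.db", 3),  # falls outside the first interval
]

print(working_set_size(trace, 0, 5))                  # 3 distinct blocks -> 12288
print(working_set_size(trace, 0, 10))                 # 4 distinct blocks -> 16384
print(working_set_size(trace, 0, 5, min_accesses=2))  # 1 re-read block   -> 4096
```

The same trace yields different answers depending on the interval and on how "hot" is defined, which is why the measurement interval has to be stated alongside any working set figure.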
Architects and CIOs keep track of working set size because it helps with:
• Aligning the solution
• Determining workloads of solution
• Operating efficiency of entire solution and finding new bottlenecks
• Planning and scalability of solution
• Maintenance Patterns
Getting the right working set size is challenging. It has to be a continuous process of monitoring, analyzing and understanding, and engineers often have difficulty getting correct values. The main reasons are:
• Sampling and Modes of working set size
• Understanding the working set size at different layers
• Understanding working set size in the application
• Understanding working set size in storage and the overall solution
PerfAccel is analytics-driven software from DATAGRES Technologies Inc. that provides comprehensive data intelligence and dynamic data management within mission-critical technology environments. Strategically placed where the entire data flow is transparent, PerfAccel operates in the I/O path below the application level and above the storage subsystem, thereby maintaining an application-aware I/O fingerprint.
Diagram 1: How PerfAccel Works with different layers
A typical deployment requires a directory or disk device to hold analytical and caching details, which are further consolidated on a single node, i.e. the management node. A graphical representation of data movement across clusters can then be viewed.
The diagram below provides an overview of a deployment comprising a PerfAccel Management node that collects analytics data from all the other Application nodes.
Diagram 2: Types of PerfAccel installations
A deployment can include several types of node:
• License Server, which is used for communication with the DATAGRES license server
• Management Node, which collects all the intelligent analytics data from the client nodes and presents it in graphical format
• Client Node, which hosts either a single application or a series of applications that need to be monitored, analyzed and accelerated
PerfAccel Cache is a feature that accelerates applications by providing a local node caching mechanism. The cache operates in two modes:
Caching
The application's hot data is intelligently cached and its I/O pattern is monitored. On subsequent re-reads, the data is served from the cache itself, providing faster access.
Analytical Caching
This mode maintains all the analytical information, which can be used for gathering intelligence across cluster/grid deployments.
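The idea of serving re-reads from a local cache while recording the access pattern can be sketched in a few lines of Python. This is only an illustration of the general technique, using a simple least-recently-used policy; the `ReadCache` class, its capacity and the `slow_source` function are hypothetical and say nothing about how PerfAccel is actually implemented.

```python
from collections import OrderedDict


class ReadCache:
    """Tiny LRU read cache that also counts hits and misses."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity   # maximum number of cached blocks
        self.fetch = fetch         # callable that reads from the slow source
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(key)      # mark as most recently used
            return self.blocks[key]
        self.misses += 1
        data = self.fetch(key)                # first read goes to the source
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict the least recently used
        return data


source_reads = []


def slow_source(key):
    """Stand-in for the backing storage; records every read that reaches it."""
    source_reads.append(key)
    return f"data-for-{key}"


cache = ReadCache(capacity=2, fetch=slow_source)
for key in ["a", "b", "a", "c", "a", "b"]:
    cache.read(key)

print(cache.hits, cache.misses)  # 2 4
print(source_reads)              # ['a', 'b', 'c', 'b']
```

The hit and miss counters are the analytical side of the sketch: if the cache capacity is smaller than the working set, as it is here, hot blocks get evicted and re-fetched from the source.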
PerfAccel can monitor a source, the process or processes accessing data from that source, or even a process group, to provide greater depth of analytics. Some of the captured elements are quite niche, such as source write latencies, queue length and source flush latencies, which are not otherwise available in large-scale management.
The easiest way to get an in-depth analysis is through the GUI, where you can view data flow for the chosen time interval. Using the GUI, one can get the data flow analysis at the Grid, Node, Context and Process Group levels.
Diagram 3: Graphical Interface showing data flow in grid
The GUI also offers insights into popular applications. For Cassandra, for example, it gathers details on keyspaces, column families, SSTables and compaction files.
Another way of understanding working set size is through the command line interface.
Diagram 4: Command line interface to show node level details
Both the command line and the GUI provide analytics data at the grid and node levels, or at much finer-grained levels such as the process level, to reveal complete behavior patterns.
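As a rough illustration of how process-level figures relate to node-level and grid-level views, the following Python sketch rolls up per-process samples. The sample layout, the node and process names and the `roll_up` helper are invented for this example and do not reflect PerfAccel's actual data format or CLI output.

```python
from collections import defaultdict

# Hypothetical per-process samples: (node, process, bytes_read, bytes_written)
samples = [
    ("node1", "cassandra", 4096, 1024),
    ("node1", "backup", 2048, 0),
    ("node2", "cassandra", 8192, 512),
]


def roll_up(samples):
    """Aggregate process-level I/O into per-node totals and a grid-wide total."""
    per_node = defaultdict(lambda: {"read": 0, "written": 0})
    grid = {"read": 0, "written": 0}
    for node, _process, read, written in samples:
        per_node[node]["read"] += read
        per_node[node]["written"] += written
        grid["read"] += read
        grid["written"] += written
    return dict(per_node), grid


per_node, grid = roll_up(samples)
print(per_node["node1"])  # {'read': 6144, 'written': 1024}
print(grid)               # {'read': 14336, 'written': 1536}
```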
For any business solution to be successful, it has to be aligned with its resources. Decisions about monitoring, maintaining and scaling a solution require keeping track of all environmental changes and resource utilization. Collating these results from every component and compiling them into a measurable, predictable output should be based on real-time data. Such in-depth insights help unlock the complexity of unknown parameters and bring order through modularization of components. PerfAccel helps you get these insights for a chosen time interval with the least complexity and effort.