Render Farm Grid Data Optimization Solution
PRODUCTIVITY OF RENDER FARMS
CGI companies deploy CPUs in clusters or grids, often several hundred strong, to provide the computing power needed to render computer-generated imagery (CGI) for film and television projects. Even with sophisticated hardware, rendering 2D or 3D models into images is time-consuming and compute-intensive, and it raises several concerns for administrators and IT leaders. Efficiency, or getting the best performance from the available resources, is chief among them. Other concerns are driven by industry trends: data sets are growing, jobs are becoming more complex, projects involve more collaboration than before, delivery deadlines are tightening and, of course, budgets are shrinking. Small wonder, then, that there is demand to make render farms more efficient and productive.
Typically, in an animation company, a group of animators collaborates on a movie project, each designing different frames and scenes. Over the course of days and weeks, hundreds of jobs are deployed across the servers in the grid cluster. The same files are read repeatedly, creating a large read I/O load on the network and the network storage. At crunch times, the large volume of read operations slows down the NAS. This not only affects the running jobs, but also creates a ripple effect across the grid. In such periods, NAS throughput fluctuates between a typical 70 MBps and as little as 20 MBps.
The usual approach to these performance bottlenecks is to over-provision, adding millions of dollars' worth of NAS hardware. Besides being prohibitively expensive, this approach introduces a new problem: load balancing. Data management also remains a challenge. Data is kept far from the jobs that use it and has to be retrieved over the network, which increases I/O latency, raises the cost per IOPS and slows applications down considerably. Faster caches can be used locally, but the payoff is not always consistent and the cost of IOPS remains a question. To optimise their data delivery system for higher speeds, administrators need to know their data better. For instance, they need to know which files are used most frequently and could therefore be cached for speedy access, and which jobs or applications require frequent storage access and which do not. Without this knowledge, optimising a data delivery system for efficiency and productivity is never easy.
FLEXIBLE CACHE DEVICES
Both inexpensive and high-end SSD options are available for expansion: the cheaper ones connect as SAS SSDs, the more expensive ones as PCIe SSDs. The acceleration achieved depends on the type of caching device used. The table below gives ballpark throughput figures for different local and network devices.
|Caching Device|Throughput|
|Single SATA drive|80 MBps|
|Single SAS drive|110 MBps|
|Single SAS SSD|240 MBps|
|RAID 0, two SAS SSDs|460 MBps|
|PCIe SSD|900 MBps|
|NAS 1G|20-70 MBps|
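To make the table concrete, the sketch below estimates how long one sequential read of a data set would take at each throughput figure. The 10,000 MB data set size, the function names and the split of the NAS range into "congested" and "typical" rows are illustrative assumptions; the estimate also ignores latency, contention and protocol overhead, so it is a rough guide rather than a benchmark.

```python
# Throughput figures in MBps, taken from the table above.
# The two NAS rows split the 20-70 MBps range into its endpoints (an assumption).
THROUGHPUT_MBPS = {
    "Single SATA drive": 80,
    "Single SAS drive": 110,
    "Single SAS SSD": 240,
    "RAID 0, two SAS SSDs": 460,
    "PCIe SSD": 900,
    "NAS 1G (congested)": 20,
    "NAS 1G (typical)": 70,
}


def read_time_seconds(size_mb, throughput_mbps):
    """Seconds needed to stream size_mb megabytes at a sustained throughput."""
    if throughput_mbps <= 0:
        raise ValueError("throughput must be positive")
    return size_mb / throughput_mbps


def estimate_read_times(size_mb):
    """Map each device in the table to its estimated read time in seconds."""
    return {
        device: read_time_seconds(size_mb, mbps)
        for device, mbps in THROUGHPUT_MBPS.items()
    }


if __name__ == "__main__":
    dataset_mb = 10_000  # hypothetical data set size, not from the source
    for device, seconds in estimate_read_times(dataset_mb).items():
        print(f"{device:<24} {seconds:8.1f} s")
```

Under these assumptions, the same read that takes 500 seconds from a congested NAS takes 125 seconds from a single local SATA drive, which is why even modest local devices can be worthwhile as caches.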
THE I/O FINGERPRINT OF THE RENDER JOB
[Figure: Top 10 read-hit inodes]
A principal problem for administrators in responding to these challenges is understanding what is happening at the I/O level in their grid environment. Administrators typically have no information about the I/O patterns of the workload. This is where DATAGRES provides value. In analytics mode, PerfAccel software provides insight into the I/O fingerprint of each job. It accumulates this information and presents it graphically, giving a view of the I/O trace for the whole application over time.
DATAGRES software provides information about the different files that are used by the jobs and collects information on which parts of the files are hot or frequently used by the jobs. Over time, in a particular movie project the same scene files are used again and again.
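The idea of ranking files by read hits can be sketched in a few lines. The example below is not PerfAccel's implementation or output format; it assumes a hypothetical trace of `(inode, bytes_read)` records and simply counts read hits and bytes per inode to produce a top-N list.

```python
from collections import Counter


def top_read_inodes(trace, n=10):
    """Return the n inodes with the most read hits.

    trace is an iterable of (inode, bytes_read) tuples; this record
    format is a hypothetical stand-in for a real I/O trace.
    Returns a list of (inode, hits, total_bytes), most-read first.
    """
    hits = Counter()
    total_bytes = Counter()
    for inode, bytes_read in trace:
        hits[inode] += 1
        total_bytes[inode] += bytes_read
    return [(inode, count, total_bytes[inode])
            for inode, count in hits.most_common(n)]


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample_trace = [
        (101, 4096), (202, 4096), (101, 8192),
        (303, 4096), (101, 4096), (202, 4096),
    ]
    for inode, count, nbytes in top_read_inodes(sample_trace, n=2):
        print(f"inode {inode}: {count} reads, {nbytes} bytes")
```

Files that stay near the top of such a ranking over the life of a project, such as the repeatedly used scene files, are the natural candidates for local caching.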
THE DATAGRES PERFACCEL SOLUTION
PerfAccel gives administrators, for the first time, a view of the active data and its dynamics in their grid environment. Beyond visibility, its analytics mode gives administrators the means to understand these dynamics and to control them, improving application efficiency and acceleration.
Deployment and configuration on the first batch of 60 servers takes less than two hours. DATAGRES also provides simple command-line usage and a single-pane management GUI with performance dashboards. The PerfAccel software deploys inside the Linux kernel. Data analytics and a single point of management are key value propositions of the solution.
EFFECT ON CACHE BEHAVIOR
RETURN ON INVESTMENT
Initially, data is read from the source and stored on the cache devices. Over a period of three days, two-thirds of the data comes to be serviced from the cache. This translates into reduced network traffic within only three days of observation. More importantly, the IOPS freed up on the back-end NAS server increase the coverage ratio, that is, the number of servers each NAS can support.
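The relationship between cache hit ratio and NAS coverage can be worked through with simple arithmetic. The sketch below assumes, as a simplification not stated in the source, that NAS read traffic is the limiting factor and that each server's NAS load falls in proportion to its cache hit ratio; the function names are illustrative.

```python
from fractions import Fraction


def nas_read_fraction(hit_ratio):
    """Fraction of reads that still reach the NAS at a given cache hit ratio."""
    if not 0 <= hit_ratio < 1:
        raise ValueError("hit_ratio must be in the range [0, 1)")
    return 1 - hit_ratio


def server_coverage_multiplier(hit_ratio):
    """Factor by which servers per NAS could grow, assuming NAS read load
    per server shrinks in proportion to the cache hit ratio."""
    return 1 / nas_read_fraction(hit_ratio)


if __name__ == "__main__":
    hit_ratio = Fraction(2, 3)  # two-thirds of data served from cache
    print("Reads still reaching the NAS:", nas_read_fraction(hit_ratio))
    print("Servers-per-NAS multiplier:", server_coverage_multiplier(hit_ratio))
```

With two-thirds of reads served from cache, only one-third reach the NAS, so under these assumptions the same NAS could serve up to three times as many servers. Real gains depend on write traffic, metadata operations and how evenly the load is spread.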
In some cases, savings on Tier 1 vendor storage are estimated at ~$200,000 per year, including maintenance and management costs. In some cases, acceleration from local SATA drives alone provides a gain of 40 servers per year, estimated at ~$200,000 annually.
DATAGRES' PerfAccel gives administrators full command and control of the data in the grid through a single console. A single pane combines analytics and insight through performance dashboards with a simple command line for operations such as creating or deleting a cache or source and adjusting sizes. Its flexible interface lets users configure their own policies for persistent cache, pre-fetching, predictive cache, real-time cache size configuration and auto-caching of hundreds of NFS mount points.
PerfAccel commands are easy to use, and an administrator can learn them in a few minutes. One of the big advantages of the PerfAccel solution is that it is hardware-agnostic with respect to cache devices. This gives system administrators the choice to upgrade the grid to faster local storage gradually, as budgets allow. PerfAccel provides flexibility in configuring data management options and works seamlessly with cache devices regardless of their type and location.