Last updated on May 25, 2020
The recently published Technical Report on Cloud Computing from UC Berkeley lists 10 obstacles/opportunities related to cloud computing, of which “Data Transfer Bottleneck” ranks #4. As I wrote in my earlier post, network bandwidth is widely considered to be a critical resource for heterogeneous computing.
The obstacles discussed are multi-dimensional. Besides data transfer bottlenecks, they include lack of availability of service, concern for data confidentiality, unpredictable performance, limits in storage scalability and software licensing complexities. For now, we will focus on obstacle #4.
Data Transfer Bottleneck
The report discusses an interesting aspect of data transfer challenges when applications are “pulled apart” across cloud boundaries. It costs $100-150 to transfer one terabyte of data over the network. The report notes that the cheapest way to transport such large amounts of data might be to physically ship disks or even whole computers! As an example, “If we … sent ten 1 TB disks via overnight shipping, it would take less than a day to transfer 10 TB and the cost would be roughly $400, an effective bandwidth of about 1500 Mbit/sec.” With disks becoming denser and cheaper, this only gets cheaper with time, so this may be one interesting opportunity.
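The report's "effective bandwidth" figure is easy to reproduce with back-of-envelope arithmetic. A minimal sketch (the delivery time is an assumption; 24 hours gives roughly 900 Mbit/sec, so the report's ~1500 Mbit/sec figure implies a delivery window closer to 15 hours):

```python
def effective_bandwidth_mbps(terabytes, hours):
    """Effective bandwidth (Mbit/sec) of physically shipping disks."""
    bits = terabytes * 1e12 * 8      # total payload in bits
    seconds = hours * 3600           # assumed door-to-door delivery time
    return bits / seconds / 1e6      # convert bits/sec to Mbit/sec

# Shipping ten 1 TB disks, assuming a 24-hour overnight delivery:
print(round(effective_bandwidth_mbps(10, 24)))  # ~926 Mbit/sec
```

The interesting property is that this "bandwidth" scales with disk density, not with network build-out, which is why the report calls it an opportunity rather than a workaround.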
Another opportunity would be to organize data in such a way that related elements are stored in close proximity, so that transfers stay local. This would require applications and storage systems to cooperate in laying out data this way.
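One simple way to keep related data close together is prefix-based partitioning: route every record sharing a common key prefix (say, a customer ID) to the same storage partition, so queries touching related records avoid cross-node transfers. A minimal sketch, where the `"prefix:rest"` key convention is an assumption for illustration:

```python
import hashlib

def partition_for(key, num_partitions):
    """Map a key to a partition by hashing only its prefix,
    so all records sharing a prefix land on the same node."""
    prefix = key.split(":")[0]  # e.g. "cust42" from "cust42:order:1"
    digest = hashlib.md5(prefix.encode()).hexdigest()
    return int(digest, 16) % num_partitions

# Related records co-locate on one partition:
print(partition_for("cust42:order:1", 8) == partition_for("cust42:profile", 8))
```

A deterministic hash (here MD5, chosen for stability across runs, not security) matters: Python's built-in `hash()` is randomized per process, which would scatter a customer's records differently on every restart.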
How about the cost of networking?
The authors say,
“A third, more radical opportunity is to try to reduce the cost of WAN bandwidth more quickly. One estimate is that two-thirds of the cost of WAN bandwidth is the cost of the high-end routers, whereas only one-third is the fiber cost. Researchers are exploring simpler routers built from commodity components with centralized control as a low-cost alternative to the high-end distributed routers. If such technology were deployed by WAN providers, we could see WAN costs dropping more quickly than they have historically.”
Further, they add, intra-cloud networking bandwidth can be a significant bottleneck. Typically, an array of processors within a rack is connected through a top-of-rack switch to second-level switches, routers, storage area networks, WANs and the Internet. Today, 1 Gigabit Ethernet is most widely used at the lower levels of aggregation. For many parallel applications that require bursts of data to be sent between nodes, this can limit performance, and it is a factor that discourages many scientists from using cloud computing. The current cost of 10 Gigabit links makes them prohibitively expensive for servers, so they are used only for link aggregation. As these become cheaper in the coming years, they can be deployed inside the clouds, significantly reducing transfer times. As 40G and 100G links become available in 2010 and beyond, they are expected to be deployed in the higher aggregation layers, enhancing inter-cloud network performance.
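The impact of the link speed on bursty node-to-node transfers is easy to quantify. A minimal sketch (the 1 TB burst size and the wire-efficiency factor are assumptions for illustration; real protocols rarely achieve full line rate):

```python
def transfer_seconds(gigabytes, link_gbps, efficiency=1.0):
    """Time to move a data burst over a link of the given speed.
    `efficiency` models protocol overhead (1.0 = ideal line rate)."""
    bits = gigabytes * 1e9 * 8
    return bits / (link_gbps * 1e9 * efficiency)

# Moving a 1 TB burst between nodes at ideal line rate:
print(transfer_seconds(1000, 1))   # 1 GbE:  8000 sec (~2.2 hours)
print(transfer_seconds(1000, 10))  # 10 GbE:  800 sec (~13 minutes)
```

The order-of-magnitude gap is why cheaper 10 Gigabit links inside the cloud matter so much for parallel workloads that exchange large intermediate results.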
As the computing clouds get bigger, it is natural that the networking components that hold the computing elements together become increasingly critical; they not only affect how much processing speed-up we can achieve using a large number of processors, but also how much it costs to deploy and use such compute farms.