From: mdford2 SpamElide Sent: Thursday, March 10, 2011 12:26 PM To: Gupta, Indranil Subject: 525 review 03/10 Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance The authors build a file system to automatically handle remote backup while balancing performance and data loss. The domain they target has a spectrum of solutions; the three most common are synchronous, semi-synchronous, and fully asynchronous. The more stringent the synchrony requirements, the stronger the guarantee of safety. But at the same time, performance degrades with link latency, making the synchronous solution unsuitable for remote backup in fast applications. SMFS proactively adds transmission redundancy to raise the probabilistic safety guarantee past that of other semi-synchronous models. SMFS is not as safe as a synchronous model, but the throughput is much better. The system uses a log-structured file architecture and adds support for atomic multi-writes, or batch commits. The authors test their implementation using Emulab and a ring testbed. They evaluate their solution against previous ones by measuring data loss, latency, and throughput. In general, SMFS achieves performance comparable to other configurations while reducing data loss and minimizing message overhead. RACS: A Case for Cloud Storage Diversity The authors present the concept of using RAID-like techniques to stripe data across cloud storage providers. This helps reliability, avoids vendor lock-in, and reduces the cost of switching providers. The authors argue that there is a "data inertia" due to the double cost of moving data from one provider to another. RACS decreases this, but simply splitting data between providers would achieve this as well; the advantage of RACS is combining this with reliability. However, recreating the data can be extremely expensive due to the transaction costs associated with pulling all of the data from all cloud providers. There should be a trade-off analysis against simply replicating the data, which is not addressed in the paper. Moreover, it is unclear if there are enough storage providers to make this approach feasible. Cloud providers differentiate themselves in features and cost. This system cannot use the advanced features of a provider, since it must interact with the lowest common denominator. The users are then paying extra for reliability on top of what the cloud provider guarantees. From: Anupam Das [anupam009 SpamElide] Sent: Thursday, March 10, 2011 12:20 PM To: Gupta, Indranil Subject: 525 review 03/10 i. Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance This paper highlights the tradeoff between performance and availability in mirroring files across geographically distributed datacenters. Traditionally there are two ways in which mirroring can be done, namely Synchronous Mirroring and Semi-Synchronous or Asynchronous Mirroring. Synchronous mirroring achieves a high level of reliability but has high latency. On the other hand, Semi-Synchronous or Asynchronous mirroring has low latency (high performance) while achieving lower reliability. SMFS (Smoke and Mirrors File System) tries to merge these two extremes by reducing the loss rate of packets on the wide area network. It does this by inserting FEC (Forward Error Correction) codes along with the original data packets.
So packets can be reconstructed from these FEC codes, which increases availability while allowing the primary site to move ahead asynchronously. SMFS uses callbacks to notify the local site that it may go ahead, with the hope that the FEC packets will be able to repair packet loss/corruption at the remote site. Another feature of SMFS is mirroring consistency, which is achieved through preserving the order of operations in a distributed log-structured file system. SMFS also maintains group consistency, where files belonging to the same file system are updated as a group through a single operation and these updates are reflected in the remote mirror site atomically. The paper compares SMFS with other strategies and shows that it can achieve low latency even in the presence of failures. Pros: 1. The system is implemented in the gateways (routers) and therefore requires no changes to the existing network. This makes it highly deployable and generic. 2. Attains low latency (high performance) while improving reliability. 3. Proactive error recovery is a novel idea and it turns out to be beneficial for mirroring across remote sites. Cons: 1. The system is suited only to applications that are willing to sacrifice total reliability for low latency. 2. Using FEC does not guarantee total recovery, as it is unable to recover when most of the data (or certain patterns of it) is lost or corrupted. TCP retransmission is very slow, and even that can fail if the source node crashes. 3. There is an overhead associated with sending FEC codes. It would have been interesting to see how much overhead was generated in the experiments. 4. The system relies on high-speed wide-area links among the remote sites, which is not always practical. -------------------------------------------------------------------------------------------------- ii. Volley: Automated Data Placement for Geo-Distributed Cloud Services Datacenters (DCs) today are geographically distributed across the globe for the purpose of fault tolerance. But they need to exchange/share data with other datacenters as well as clients. This paper aims to reduce both the latency and the costs associated with this inter-datacenter traffic by proposing a heuristic strategy for placing data across these geo-distributed datacenters. Now, the simplest solution to this data placement problem is migrating data to the DC closest to the user. However, there are a number of issues, like datacenter capacity, data interdependencies, user-perceived latency, and user mobility, which make this decision complicated. So, this paper proposes a heuristic algorithm called Volley to determine a near-optimal placement of data to DCs. Volley takes as input the request logs for data in the system (it can also take other inputs like DC capacity and latency) and analyzes them to provide outputs that indicate where to best place the data. Volley is an iterative algorithm. In phase 1, Volley computes an initial placement based on client IPs. In phase 2, Volley iteratively moves data to reduce latency. Finally, in phase 3, Volley iteratively maps data items into datacenters, taking into account the datacenter capacities. Evaluations show that Volley converged after a small number of iterations, and reduced datacenter capacity skew by 2x, inter-datacenter traffic by 1.8x, and latency by 30%. Pros: 1. It considers a number of factors in deciding the placement of data. All these factors are realistic and have a direct impact on the optimal placement. 2. It automates the data placement process in the background. 3.
Reduces datacenter capacity skew, inter-datacenter traffic, and latency. Cons: 1. Volley uses IP addresses to locate users, but users could be behind a proxy, VPN, or the Tor network, in which case the IP address would not indicate the real geo-location of the user. Maybe using RTT along with geographic location could give a better idea. 2. Oscillation in data movement could lead to unnecessary migration. 3. Each run of Volley takes a significant amount of time (almost 16 hours), which results in heavy computation cost. ----anupam From: Igor Svecs [isvecs SpamElide] Sent: Thursday, March 10, 2011 12:15 PM To: Gupta, Indranil Subject: 525 review 03/10 Igors Svecs CS525 paper review 2011-03-10 RACS: a case for cloud storage diversity SUMMARY This paper generalizes RAID to cloud storage by having cloud storage requests go through a proxy that transparently translates regular queries into RACS queries. The stated goal is to prevent vendor lock-in, where the cost to migrate from one provider to another may be too high and, as a result, companies will remain locked in with suboptimal providers ("storage inertia"). Similarly to RAID, RACS also provides some tolerance against cloud storage provider outages. RACS uses erasure coding somewhat similar to RAID-5. It is implemented as a Python service that emulates the Amazon S3 interface to the clients, and uses provider-specific adapters. These adapters map provider-specific APIs to the basic Amazon S3 API. Evaluation shows no performance overhead, possibly due to parallelism. COMMENTS Main criticism: We need to consider how cloud storage is being used in companies. In many cases it is used for load distribution when content is not used for internal purposes but is directly served to the client (via the web, for example). Even their chosen example of the Internet Archive supports that. In these cases it is used as a kind of CDN. However, with RACS, traffic has to go through a central proxy in order to be assembled before serving it to the clients. In this case, inbound traffic (cloud -> company network) and outbound traffic (company network -> client) is pure overhead. The RACS proxy is distributed, so it may not necessarily create a bandwidth bottleneck, but it will impose additional cost. Other comments: I do not buy the argument that vendor lock-in is a significant problem in this case. For long-term backups / storage that would require lots of data, immediate availability is not required, and traditional cost-effective means (tapes) can be used. I doubt that the subset of data that actually needs to be online all the time is large enough to impose a significant migration barrier. The interface and supported features are reduced to the lowest common denominator; only a basic list/get/put/delete interface is supported by RACS. Some providers may compete on additional features, and discarding them seems to be unwise. Per-connection pricing is not considered, which is a significant drawback. As we know from previous reading in this course, there is a minimal per-connection cost, so it's inefficient to download many small files. When data is striped across multiple providers, it is more likely that the average requested file size will be reduced. Also, authentication should have been implemented. Other thoughts: another primitive needs to be added to the traditional list/get/put/delete interface: get_web(file, principal), where an arbitrary principal (for example, an anonymous user) can get a file via the web.
This primitive is supported by many providers and is used by many subscribers, so it is unreasonable to ignore it. Volley: Automated Data Placement for Geo-Distributed Cloud Services SUMMARY Volley is a system that is responsible for geographical distribution of objects across multiple data centers. Two parameters to be minimized are WAN bandwidth between data centers and user latency. While read-only data can be easily handled by CDNs, this work focuses on shared data that contains inter-dependencies; hence, Microsoft’s Live Messenger and Live Mesh traces were chosen for analysis. This paper also addresses user mobility. Volley works by analyzing request logs to the object, and moving the object between data centers. An algorithm that is implemented as a series of MapReduce-style jobs that periodically analyzes objects for migration opportunities is presented. First, initial placement of objects on the map is computed; second, iterative optimization/shuffling to reduce latency is done; third, objects are mapped to data centers to satisfy capacity constraints. COMMENTS This is an interesting paper that only could have been produced (and is directly usable) by select few organization with resources to own multiple data centers distributed across the world. All objects are assumed to be dynamic and are subject to the periodic algorithm. Volley could be extended to be more static-content friendly; that is, allow developers to specify what content should be excluded from the periodic job to reduce its running time. By not implementing replication and migration mechanisms, Volley remains flexible. For example, it does not consider when multiple replicas of the same object are stored in different data centers, and assigns different GUIDs to them. The disadvantage of this approach is that Volley places the burden of replicating objects and keeping track of master/copy relations to applications. I would still consider it a net advantage, as it enables Volley algorithm to be used to suggest migration to applications. It is useful when Volley is run by a hosting company (owner of data centers), and customer make decision whether or not to migrate data based on the cost. From: Curtis Wang [cwang89 SpamElide] Sent: Thursday, March 10, 2011 12:12 PM To: Gupta, Indranil Subject: 525 review 03/10 Curtis Wang (wang505) 3/10/11 Geo-Distribution Smoke and Mirrors The Smoke and Mirrors File System (SMFS) offers an alternative between synchronous, asynchronous, and semi-synchronous mirroring with their “network-sync” option. Instead of blocking until the data is safely mirrored (synchronous), only writing to local disk and immediately sending updates (semi-synchronous), or just writing to local disk (fully-asynchronous), the network-sync option uses redundancy at the network level (by sending error correction packets using Forward Error Correction) and performs callbacks when the error recovery data has been sent. To maintain consistency between replicas, SMFS uses a distributed log-structured file system (this is somewhat reminiscent of GFS). Pros - Provides more protection than current asynchronous and semi-synchronous solutions Cons - Error correction packets impose a small constant increase in network traffic The SMFS seems to be a viable solution only for companies that have extremely sensitive data. 
With the extremely low probability of disasters, companies like Facebook probably will not be interested in incurring the extra network traffic and costs to potentially save just a slice of activity on the social network. Also, I was expecting a little bit more from the system instead of simply adding FEC packets and callbacks. Volley: Volley is a system that analyzes logs of datacenter requests to compute migration recommendations using an iterative optimization algorithm. The algorithm takes into account shared data, data interdependencies, application changes and updates, datacenter capacity limits, and user mobility. The Volley algorithm consists of three phases: the first phase computes an initial placement of the data items in question, the second phase (which takes the most computational time) iteratively improves the data placement, and the third phase maps the data items to data center locations, taking into account datacenter capacity. Volley significantly outperforms three basic heuristics for data migration (place data as close to IP as possible, use only one datacenter, hash datacenters for optimal load-balancing). Pros - Migration is dependent on the datacenter application allowing flexibility - Can reduce datacenter capacity skew, inter-datacenter traffic, and user-perceived latency - Relatively low computational costs and has advantages when run frequently Cons - Companies must be at a large enough scale in order to take advantage of the system, e.g. must have multiple datacenters in geographically disparate locations already in place How would Volley work for migration of replicas or is that handled by the datacenter application? What would be the optimal placement of replicas? Also, it would have been interesting to hear more about how Volley could allow for datacenter placement recommendations (without rerunning the algorithm several times with new locations in mind), which is mentioned as a possibility in the conclusion of the paper From: trowerm SpamElide on behalf of Matt Trower [mtrower2 SpamElide] Sent: Thursday, March 10, 2011 11:45 AM To: indy SpamElide Subject: 525 review 03/10 Smoke and Mirrors This paper discusses the implementation of a new protocol for geographically distributing data across servers. Previous work has been based on local sync, a performance-centric unreliable mechanism, and remote sync, a reliable but slow mechanism. The authors strike a balance between these ideas with what they call network-sync. The idea is to send the data into the network immediately with FEC codes and treat this like it is already backed up remotely. Returning immediately gives the feeling of local sync with the FEC codes providing the reliability of remote sync. This work assumes a reliable network to act as storage for the inflight data. In congested areas, the FEC would prove meaningless if packets are being dropped. Perhaps this is the case if companies have SLA's with network providers, but this does not help with global distribution and is impractical in the long run. Volley Volley is a decision engine for data placement in distributed data centers with specific application to Live Messanger and Live Mesh, two of Microsoft's cloud based applications. Decisions are based upon several factors including: user perceived latency, intra data center bandwidth, inter data center bandwidth, and data center capacity. The volley algorithm makes this decision by making an initial guess, then repeating the process of improving data latency then mapping data to data centers. 
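To make the initial-placement step concrete: phase 1 places each directly accessed data item at a weighted spherical mean of the geographic coordinates of the clients that request it, weighted by request counts. The paper defines this mean through repeated interpolation along great circles; the sketch below uses a simpler vector-averaging approximation on the unit sphere purely to illustrate the idea, and the function names, client coordinates, and request counts are invented for illustration.

```python
# Illustrative sketch only (not the paper's exact recursive definition):
# approximate a weighted spherical mean by averaging 3-D unit vectors and
# projecting the result back onto the sphere. Weights stand in for request counts.
import math

def to_unit_vector(lat_deg, lon_deg):
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def to_lat_lon(x, y, z):
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    z = max(-1.0, min(1.0, z))          # guard against floating-point drift
    return math.degrees(math.asin(z)), math.degrees(math.atan2(y, x))

def weighted_spherical_mean(points, weights):
    """points: list of (lat, lon) client locations; weights: request counts."""
    sx = sy = sz = 0.0
    for (lat, lon), w in zip(points, weights):
        x, y, z = to_unit_vector(lat, lon)
        sx, sy, sz = sx + w * x, sy + w * y, sz + w * z
    return to_lat_lon(sx, sy, sz)

# Hypothetical phase-1 placement for one data item accessed from three cities:
clients = [(40.1, -88.2), (47.6, -122.3), (51.5, -0.1)]   # made-up client geolocations
requests = [120, 30, 15]                                   # made-up request counts
print(weighted_spherical_mean(clients, requests))
```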
The paper also presents some interesting traces from their user applications such as 35% of conversations on Live Messenger happen over significant geographic distances. It is not clear that the algorithm will stabilize after so many iterations. Furthermore, the current deployment only runs every 14 hours. I would guess that inter-DC latencies are not stable over such a long time period. A redeeming quality of the work is its application in provisioning new data centers. If one data center seems to be always overloaded then new data centers should be built in that region. Applications like Live Mesh would have more private user data then sharing based websites. Distribution of this data might be more interesting as you can't just follow the owner's location. From: Andrew Harris [harris78 SpamElide] Sent: Thursday, March 10, 2011 11:34 AM To: Gupta, Indranil Subject: 525 review 03/10 Review of “RACS: A Case for Cloud Storage Diversity”, Abu-Libdeh et al, and “Smoke and Mirrors: Reflecting File at a Geographically Remote Location Without Loss of Performance”, Weatherspoon et al RACS is an attempt to leverage existing cloud storage systems while mitigating the effects of vendor lock-in and so-called “data inertia”. As described in the paper, data inertia is an effect of ever-growing masses of data when bandwidth costs are nontrivial; as the data mass grows, it becomes costlier to move it out of the original data system. Their approach to combating both of these problems with existing cloud storage is to stripe data across multiple cloud systems in a manner closely resembling RAID 5 and 6, through their RACS proxy (the number of parity disks in their system is configurable). They also describe a distributed RACS management system using Apache ZooKeeper for access control between multiple RACS proxies. The end result of their system is a fault-tolerant cloud-based backup system that is relatively trivial to relocate across cloud systems, that minimizes the cost of data inertia, and that is not substantively more expensive in active use than simply choosing a single cloud provider. The group only considered distributed RACS using two concurrent proxies, and reported that using two proxies introduced a negligible overhead to the transfer. I don’t feel this is a sufficient test, given the scalability claims for RACS. Consider latency on a put operation with multiple other RACS clients attempting to get the resulting file; all these clients must block in order for the one client to finish writing. If this were a large file, or a file that must be highly available (or both!), then the RACS system has introduced a significant bottleneck to this system. Consider also that as RACS proxies are added and as the amount of coordination between them increases, this negligible overhead quickly grows (it should be exponential). Distributed RACS then may be suitable for very small implementations, but should not be relied upon for systems requiring larger numbers of RACS proxies. Smoke and Mirrors (SMFS) takes a transparent volume mirroring approach to data redundancy. Where RACS relies on using many offsite backup locations, SMFS is designed for communication between few sites, with very fast (high bandwidth, low latency) connections between sites. 
Their use case is that of a system with a very large, constantly growing mass of data (the Cornell National Lambda Rail Rings facility), so SMFS also takes into account error-correction codes and uses what appear to be a series of delta updates to push out redundant information. The result is in their “network-sync” approach, which is able to handle large pushes of data with disaster handling that is at least as good as a local backup system. An assumption made by SMFS is the existence of high-bandwidth, low-latency links between data centers, which are necessary to handle the proposed volumes of information that SMFS is designed to address. While these sorts of links do exist, and are available to certain large data center hosts, they are not available to all hosts. Furthermore, it appears that these network links are assumed to be dedicated, direct links (or nearly direct links with minimal hops and specially designed datacenter routers) with no other applications sharing their pipeline. This is an unrealistic assumption if the SMFS system is to be implemented across a general network, as it would be in the case of a smaller company with large data needs. Though RACS is more generally applicable than SMFS, some of the error correction pieces in SMFS could be applied to RACS to guarantee data integrity across heterogeneous network links. As implemented, RACS seems to only guarantee data consistency once that data is in the cloud; in transmission, there is not an equivalently strong guarantee. The use of forward error correcting in SMFS could easily be applied in the network layer in RACS, totally transparent to the user, and would increase the confidence of a good transmission with minimal increase to bandwidth overhead. Their implementation of FEC is also designed to handle bursty communication, which would make it suitable for a wider range of applications that SMFS could handle with guarantees of transfer integrity. In the reverse direction, it may be that SMFS could incorporate multiple-site striping as borrowed from RACS to ensure both data redundancy and to cut down on single-site bandwidth needs, although the need for fast connections out of the originating site becomes an issue. It also would make the network transfer more susceptible to failure, as it introduces more points of failure from origin to destination, however this may be a non-issue by the RAID-like design of RACS.From: wzhou10 SpamElide Sent: Thursday, March 10, 2011 11:19 AM To: Gupta, Indranil Subject: CS525 Review 03/10 CS525 Review 03/10 Wenxuan Zhou Review on “Volley: Automated Data Placement for Geo-Distributed Cloud Services” Core idea The Volley paper presents an implementation for data placement for cloud service within multiple datacenters. This automated placement strategy, Volley, balances data volume across geographically distributed datacenters to reduce both inter data center traffic and user latency, by mapping the placement problem to a global optimization problem. Pros 1. One aspect I really like about the paper is the simple idea and implementation. 2. Volley lets application make the final decision for data migration, so application specific requirements can be used in Volley. Cons 1. Volley takes some historic data (logs) and some task requirements (RAM, CPU etc) as input. So one question is how long this history is. It’s not always the longer the better. 2. Volley only deals with the case where data and the services using the data are controlled by the same entity. 
But for the cases where services use some third-party data, there might be a situation where the service and the data provider each try to optimize the data location independently, which leads to oscillations. 3. Volley maps IPs to geographic locations and then uses physical distance as the latency metric. However, I believe a shorter physical distance does not necessarily mean smaller latency, once different connecting media and queuing delay at routers are taken into account. 4. In the evaluation section, as we can see, the capacity skew is quite serious in Volley's first two phases, but after phase 3 the latency becomes larger than that of phase 2. The paper does not point out which stage's result is used as the reference when comparing with the commonIP strategy. As far as the previous tests show, only stages 1 and 2 are used. That could be unfair: before mapping to actual datacenters, collapsing data to a nearby geographic location could lead to contention, leaving the nearest datacenter unable to process such a large amount of data, in which case the presumed latency model would fail to hold. Review on "Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance" Core idea This paper describes the Smoke and Mirrors File System (SMFS), which creates backups of files at geographically remote sites. SMFS proposes a new synchronization mechanism called "network-sync", which has two additional properties: 1) Forward Error Correction (FEC), and 2) application callbacks. In this way, SMFS is able to offer a strong performance level as well as a data reliability guarantee. Pros 1. The idea of adding redundancy at the network level is novel; it preserves performance while improving reliability. 2. The network-sync concept is general and can be used by many applications. 3. SMFS doesn't rely on a reactive TCP-based error correction mechanism; instead, it uses proactive error correction. This improves fault tolerance and reduces network traffic as well. 4. SMFS minimizes jitter while files are updated in a short period of time. Cons 1. Though network-sync is better than semi-sync and asynchronous solutions in terms of reliability, it is still not as safe as synchronous ones. So it might be better to give the application the right to choose among those solutions. For instance, some applications that require high reliability and don't care about latency may be willing to use synchronous solutions. 2. SMFS is dependent on network stability, which could be highly unreliable. 3. The assumption of high-bandwidth wide-area channels between remote sites makes SMFS less useful to small institutions. From: nicholas.nj.jordan SpamElide on behalf of Nicholas Jordan [njordan3 SpamElide] Sent: Thursday, March 10, 2011 11:17 AM To: Gupta, Indranil Subject: 525 review 03/10 Nicholas Jordan njordan3 3/10 Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance Many companies back up their transaction data and records at remote sites to prevent data loss. These remote sites, called mirrors, are exact copies of the local primary storage. To communicate the updates, the existing methods are synchronous, semi-synchronous, and asynchronous mirroring. Synchronous mirroring only allows the main site to continue once the remote mirror has acknowledged that the transaction has been saved. In this approach, the latency means it is only feasible with the remote site in close proximity, like a state away.
In the case of a rolling blackout, this won't be sufficient. Semi-synchronous and asynchronous mirroring try to alleviate the latency issue. However, they are both susceptible to losing transactions or becoming inconsistent under a network partition. The paper introduces network-sync, compares it to the existing models, and explains how this approach lessens the potential risk of data loss while keeping good performance. Pros: - more reliable than a semi-sync mirror Cons: - still suffers from the trade-off of reliability vs. performance Additional comment: N/A RACS: A Case for Cloud Storage Diversity This approach applies RAID 5 to the cloud. They split the data into m blocks and then keep another block for error correction, bringing the total stored to m+1 blocks. Much as data is distributed among different physical disks, the blocks are distributed among different cloud providers. Pros Response time with RACS(2,3) is faster than a single RackSpace slice Source code is available under a BSD license Able to mitigate cloud hosting price increases Cons More expensive to do a full data migration Bottleneck is the slowest cloud provider Additional comments My only concern is that the claim about switching providers is not quite accurate. Using May 2009 prices, they estimated that if you had stored all your data with one provider, moving it would cost about $24K. If you used RACS, it would cost about $3K per provider. So if you want to move all your data to a totally new set of hosts, the actual cost is 9*$3K = $27K in the case of RACS(8,9). Another motivation to completely move your data: the country's government is being overthrown (e.g., Egypt). I still think this is a good way to indirectly do geographic backup if you choose your providers well. -- Thanks, Nick Jordan From: muntasir.raihan SpamElide on behalf of muntasir raihan rahman [mrahman2 SpamElide] Sent: Thursday, March 10, 2011 10:41 AM To: Gupta, Indranil Subject: 525 review 03/10 Volley: Automated Data Placement for Geo-Distributed Cloud Services Summary: This paper addresses the challenges of automated resource placement in geo-distributed data centers, where users want to minimize latency and cloud providers want to minimize cost. The authors define the problem formally and then propose solutions that outperform some existing methods. Some of the important motivating factors for Volley include: shared data, data inter-dependencies, application changes, reaching data center capacity limits, and user mobility. The Volley algorithm takes as input the network cost model, DC capacities and locations, and request logs. It works in three phases. In the first phase, it calculates a geographic centroid for each data item considering client locations. Next, it refines the centroid for each data item iteratively, considering both client locations and data inter-dependencies using a weighted spring model. Finally, the centroids are mapped to actual data centers. Data center overload is handled by rolling over the least-used data. In practice, the number of iterations is equal to the number of data centers. Volley was compared against existing heuristics including hash, oneDC, and commonIP in terms of latency, inter-DC traffic, and frequency of migration. Pros: (1) Volley makes placement decisions in a coordinate space which are later collapsed to actual data center locations. If there is no nearby data center for a confined region to which a large amount of data has been migrated by Volley, then a location near that data could be an ideal place to build a new data center.
As a result, Volley can be utilized for capacity upgrade decisions. (2) Volley works for geo-distributed data centers. (3) Considers both client-data latency and inter-data latency for dependent data items. Cons: (1) In the worst case, the solution might not converge and could result in oscillations. (2) The rationale behind using weighted spherical means for initial placement is not clearly explained in the paper. (3) The capacity skew in the first two phases is quite significant. (4) 14 hours of computation for each run of Volley is a lot. Future Work: (1) Volley utilizes historical logs and task requirements to find an approximately optimal placement for data. The placement decision could be further improved by incorporating the availability of resources in data centers and working migration paths. (2) A distributed Volley solution might be interesting. (3) Volley could be enhanced to include data replication decisions. (4) The performance of Volley could be improved by considering predictions of the future in addition to past history when making placement decisions. Discussion Points: (1) Volley is basically a placement heuristic. VLSI placement algorithms like the FL and KL heuristics use the same approach of starting with a random placement and then iteratively trying greedy moves to improve performance. The question is how Volley differs from these existing algorithms, other than being applied to a different domain. RACS: A Case for Cloud Storage Diversity Summary: RACS is a RAID solution for clouds where multiple cloud providers are combined for data storage to reduce the risk of failures and single-vendor lock-in. At present, many customers are locked in with a single cloud provider due to the large cost of switching data, even if there are better options. To completely transfer a large amount of data from one vendor to another, customers have to pay the bandwidth cost for both vendors, which might be prohibitive. As a result, RACS proposes to use erasure coding across multiple clouds to reduce the cost of switching vendors. In erasure codes, each data item is partitioned into m blocks and the system generates n > m blocks such that any m out of the n blocks can be used to reconstruct the data. The n blocks are distributed among n distinct repositories. RACS mainly deals with economic failures, e.g., a price hike by a single vendor. Usually these types of failures can be predicted beforehand. The authors also propose a distributed version of RACS using ZooKeeper for coordination. They implemented a prototype of RACS and used the Internet Archive trace to evaluate the cost of initially moving to the cloud, the cost of vendor lock-in, and the impact of price hikes. Pros: (1) Can deal with vendor lock-in and price fluctuations. (2) Generalizes RAID (RAID 5 is RACS(4,5)) and pure replication (m = 1). Cons: (1) Although RACS can mitigate economic failures, it has performance overheads in terms of storage, number of requests, bandwidth, and latency. Future Work: (1) Can RACS be used for non-economic failures? (2) Building RACS with heterogeneous repositories. (3) Using RACS to deal with cloud burstiness. -- Muntasir From: Chi-Yao Hong [cyhong1128 SpamElide] Sent: Thursday, March 10, 2011 8:32 AM To: Gupta, Indranil Subject: 525 review 03/10 Chi-Yao Hong hong78 SpamElide ----Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance, FAST'09 ---- To improve availability, it is common to replicate data across multiple geographically remote sites for fault tolerance.
However, the existing solutions compromise performance. For example, synchronous mirroring solutions block the application while data is being mirrored to the remote site, which introduces significant latency. Also, system performance becomes sensitive to the link latency. On the other hand, semi-synchronous mirroring provides low latency by allowing modifications during mirroring; however, packet loss could occur if a disaster strikes the original site. These asynchronous schemes provide great performance but have weak safety properties. This paper proposes a new mirroring scheme, network-sync, to improve the safety properties of semi- and asynchronous schemes while preserving their performance. It leverages in-network backup, which improves data safety. Also, the authors propose using feedback messages to signal the successful transmission of recovery data. Therefore, the client can safely remove the data after receiving the feedback. The authors implemented the system on Emulab and the NLR Ring testbed. Pros: 1. Unlike existing schemes, it can be shown that the performance of network-sync is independent of link latency. 2. Using in-network buffer support is a novel idea to improve end-host performance. Cons: 1. I fail to see why UDP is a good choice of transport protocol for this design. It clearly compromises data correctness, and the forward error correction codes cannot always recover corrupted data. 2. I am not convinced that a log structure is optimal for network-sync. The evaluation does not show how much overhead disk cleaning could impose on the system. ----Volley: Automated Data Placement for Geo-Distributed Cloud Services, NSDI'10 ---- In recent years, cloud service providers have increasingly been building data centers that are geographically dispersed. By replicating the service, users can enjoy lower latency by connecting to closer data centers. Another concern is that the service provider has to pay for the WAN bandwidth between data centers. The major challenge for this paper is to provide low user latency for dynamic objects while considering the operator's service cost, including the consumed WAN bandwidth between data centers and the provisioned capacity in these locations. Volley studied real applications, namely Microsoft Live Messenger and Live Mesh. They show that users are spread around the world; therefore, there is a need for geo-replicated servers to improve service latency. Also, users continue using these services when they are traveling. Moreover, more than 35% of users communicate with distant users (more than 1000 miles away). Therefore, providing a continuous service is also an important issue to deal with. Volley performs automatic data placement by considering the above issues. Pros: 1. Compared with simple heuristics, Volley significantly reduces both user latency and inter-datacenter traffic by solving a weighted spherical mean calculation iteratively. The authors verify the improvement using real data traces from Live Mesh. Cons: 1. The whole design is inspired by the analysis of Live Mesh and Live Messenger. I doubt it is general enough for other cloud services. For example, online gaming would have stricter latency requirements, while a file hosting service would have high bandwidth requirements. Cloud service requirements can be very different and application-specific. Therefore, Volley might perform suboptimally for other services.
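The proactive error correction that network-sync relies on can be illustrated with the simplest possible scheme: one XOR repair packet per group of data packets, which lets the mirror rebuild a single lost packet in the group without waiting for an end-to-end retransmission. This is only a minimal sketch; Maelstrom's actual FEC is layered and interleaved to cope with bursty loss, and the packet contents and group size below are made up.

```python
# Minimal illustration of proactive FEC: for every group of equal-length data
# packets, emit one XOR repair packet; the remote mirror can then reconstruct
# any single lost packet in that group locally.

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def encode_group(packets):
    """packets: equal-length byte strings; returns the XOR repair packet."""
    repair = bytes(len(packets[0]))
    for p in packets:
        repair = xor_bytes(repair, p)
    return repair

def recover_missing(received, repair):
    """received: the group with exactly one packet replaced by None (lost)."""
    rebuilt = repair
    for p in received:
        if p is not None:
            rebuilt = xor_bytes(rebuilt, p)
    return rebuilt

group = [b"update-1", b"update-2", b"update-3"]   # made-up log updates
repair = encode_group(group)
arrived = [group[0], None, group[2]]              # the second packet was lost in the WAN
assert recover_missing(arrived, repair) == group[1]
```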
-- Chi-Yao Hong PhD Student, Computer Science University of Illinois at Urbana-Champaign http://hong78.projects.cs.illinois.edu/ From: Agarwal, Rachit [rachit.ee SpamElide] on behalf of Rachit Agarwal [agarwa16 SpamElide] Sent: Thursday, March 10, 2011 8:03 AM To: Gupta, Indranil Subject: 525 review 03/10 ---- Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance The paper tries to find an operating point of interest in the trade-off space of latency and mirroring reliability. The authors argue that using an error correcting code over the transmitted data could improve mirroring reliability while allowing the primary site to respond quickly to the user. I found this particular application of error correcting codes interesting. I believe it would be interesting to relax a number of assumptions made in the paper. In particular: 1. The authors assume dedicated links between the primary and the remote site. The problem becomes extremely simple if the two sites have dedicated links between them; the solution may have wider applicability if this assumption is relaxed. 2. It would have been interesting to see some discussion on why the authors found this particular operating point interesting; it is not entirely clear to me that having FEC would improve much on semi-sync mirroring without too much overhead in terms of the amount of data transferred. 3. An apparently interesting problem may be to actually find the location at which one should mirror in order to achieve a good trade-off point. 4. How complex are the egress and ingress router operations? ---- Volley: Automated Data Placement for Geo-Distributed Cloud Services Automating the placement of data across geographically distributed data centers in order to minimize user-perceived latency is a complicated problem. As the amount of shared data and the number of users accessing the same data increase, the problem is only complicated further. This paper takes an initial stab at the problem, and designs a system that places the data while taking data inter-dependencies, application changes, and user mobility into account. Most of the observations made prior to the system design are from two traces: Live Mesh and Live Messenger. The data demonstrating data interdependency and client mobility is particularly interesting. In general, I like the approach of motivating the paper using real-world observations. That said, the paper seems to be limited in analysis and evaluation. In particular, I have the following concerns: 1. The latency we talk about in this paper is wide-area latency, which varies arbitrarily with time. Hence, placement schemes that do not use real-time latency may be inefficient. That said, measuring real-time latencies is a simple task, and I believe it could be easy to integrate this with the system. IMHO, it is not enough to just consider geographic locations. 2. I am not completely sure if Volley is "stable" in the sense that it avoids oscillating. 3. It could have been really interesting to see evaluation results on a real set of data with real users in real time. The simulation setup of the paper is quite simplified and, in general, may not be representative. For instance, the authors use only 20 virtual machines across the data centers.
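The concern about geographic locations versus real latency comes down to the latency proxy: optimizing great-circle distance implicitly assumes RTT grows smoothly with distance. Below is a minimal sketch of such a distance-based estimate; the constants are illustrative rather than taken from the paper, and substituting measured per-(client, datacenter) RTTs is the obvious way to address the criticism.

```python
# Crude, purely geographic latency proxy: estimate RTT from great-circle
# distance. It ignores routing, congestion, and queueing, which is exactly the
# weakness the review points out. Constants are illustrative.
import math

EARTH_RADIUS_KM = 6371.0

def great_circle_km(a, b):
    (lat1, lon1), (lat2, lon2) = [(math.radians(p[0]), math.radians(p[1])) for p in (a, b)]
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def estimated_rtt_ms(client, datacenter):
    # roughly 1 ms of RTT per 100 km plus a fixed overhead -- a made-up model
    return 5.0 + great_circle_km(client, datacenter) / 100.0

print(estimated_rtt_ms((40.1, -88.2), (47.2, -119.9)))   # hypothetical client and DC
```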
---- Best regards, Rachit -- Siebel Center for Computer Science, University of Illinois at Urbana-Champaign, Web: http://agarwa16.wikidot.com ---- Sing as if No one is Listening, Dance as if No one is Watching, Dream as if You'll Live Forever, Live as if You'll Die Today !! From: david.m.lundgren SpamElide on behalf of David Lundgren [lundgre4 SpamElide] Sent: Thursday, March 10, 2011 2:46 AM To: Gupta, Indranil Subject: 525 review 03/10 RACS: A Case for Cloud Storage Diversity Abu-Libdeh et al. apply the design principles and lessons learned from RAID to cloud data storage providers to produce RACS (Redundant Array of Cloud Storage). The need for a RAID-esque system amongst cloud data stores is driven by two distinct types of "failures:" outages (i.e. periods of data unavailability due to unforeseen events) and economic failures (i.e. market fluctuations, for example: price, that cause the service of a provider to become extremely undesirable). Economic failure seeks to characterize problems associated with vendor "lock-in," or the inability to readily and cost-efficiently switch among service providers. RACS is designed as a proxy service and, for simplicity, uses the same data model as Amazon's S3 service. RACS accepts data from the client and then splits the data into data shares, uses erasure coding to generate redundant shares, and then redistributes these redundant data shares to each of RACS's repositories, striping the data across multiple repositories. The RACS proxy itself is a distributed system that coordinates synchronization through Apache ZooKeeper. RACS uses wrappers to map the APIs of various cloud storage providers to their own API. The authors then provide a theoretical overview of performance overhead before moving on to describing an economic case study of moving the Internet Archive to the cloud. Various hypothetical situations are posited and RACS is shown to reduce costs. Empirical benchmarks for performance overhead are then detailed. Pros, Cons, Comments, and Questions: 1. I take issue with the empirical section of the paper. Running RACS on the same computer as the client is not a good approximation of the real world (or at least does not fit with the purported "distributedness" of RACS). 2. The economic motivation for RACS is insightful and the applicability of RAID's principles to the problem seems to provide an elegant solution. However, as the authors mention, there is not a great enough diversity among cloud providers to make such a protocol feasible. 3. Reliance on a single RACS proxy for data striping across cloud storage providers leaves clients vulnerable to RACS vendor lock-in. ---------------------------------------------------------------------- Volley: Automated Data Placement for Geo-Distributed Cloud Services Agarwal et al. introduce Volley, a system that analyzes cloud services' log data and produces data migration plans. Volley is designed to output such recommendations in the face of varying WAN bandwidth and associated costs between data centers, data center usage requirements, and other contractual and legal obligations of data sharing among data centers. The authors first analyze commercial cloud-service traces. Data sharing between geographically distributed components and the mobility of clients for specific applications are shown to resemble long tail distributions. Volley periodically reviews log data in the Cosmos distributed storage system submitted by application servers in geographically diverse data centers. 
With each analysis, Volley outputs application-specific migration policies, leaving clients to implement their own application-specific data migration mechanisms. Volley's placement algorithm is composed of three phases: 1. initial placement computation using the weighted spherical mean; 2. latency reduction via iterative data movement using spherical coordinates and spring modeling; 3. iterative collapse of data to data centers with awareness of data center capacity and skew constraints. In the evaluation section, Volley is compared to three other migration heuristics: commonIP, oneDC, and hash; Volley is shown to outperform these heuristics in perceived user latency. Pros, Cons, Comments, and Questions: 1. Volley seems better grounded than the other data placement heuristics compared, but the authors leave out any formal theoretical analysis of Volley. 2. Volley focuses (and shines) on reducing user-perceived latency, but this isn't the only criterion that a user might be interested in optimizing. Increasing bandwidth or throughput to users might also be useful. 3. It was not clear to me whether Volley is flexible enough to specify weights for the relative importance of clients' optimization criteria. From: lewis.tseng.taiwan.uiuc SpamElide on behalf of Lewis Tseng [ltseng3 SpamElide] Sent: Thursday, March 10, 2011 1:58 AM To: indy SpamElide Subject: 525 review 03/10 CS 525 - Review: 03/10 Geo-Distribution Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance, H. Weatherspoon et al, FAST 2009 The paper addresses the unresolved trade-off between safety and performance when securing data by replicating it at a geographically remote datacenter. Three mirroring solutions had been proposed: synchronous (remote-sync), semi-synchronous, and fully asynchronous (the latter two are called local-sync), which are ordered from highest to lowest latency and safety. The paper proposes a solution lying in the middle of the spectrum, i.e., it has a latency similar to local-sync systems but with comparable reliability (though not as safe as the remote-sync one). The system is called the Smoke and Mirrors File System (SMFS). The paper classifies SMFS as network-sync, because it keeps substantial information in the network and uses coding so that the remote site is able to recover data lost when a disaster hits the primary site. The first contribution of SMFS is the novelty of proactively injecting error recovery data into the network even when the primary site has not received any negative acknowledgement. In this way, the primary site can save a lot of time: even though it does not wait for an acknowledgement from the remote site, it can expect the remote site to be able to recover from data loss under most circumstances. The second contribution is to identify that a distributed log-structured file system is almost ideal for file mirroring, due to advantages such as its optimization for write performance. Moreover, SMFS does not need to worry about the biggest disadvantage of a log-structured file system, segment cleaning costs, because the paper assumes that the important data to be mirrored is known in advance and of moderate volume, and thus SMFS should have enough capacity. Comments/questions/critiques: Though the latency of SMFS seems to be fine, what would happen if the scheme of injecting error recovery data into the network were widely adopted? Wouldn't network congestion significantly affect performance?
There is a lot of overhead in the "smoke," and most of that data is useless when the network is relatively stable. Thus, I am not fully convinced that exploiting network buffers for safety is a sound design choice. Since many international corporations now have many sites globally, it is natural to ask whether SMFS can be seamlessly adapted to the case of multiple primary/backup sites. It is unclear which site should store which mirrored data from where. Would a more novel replication scheme based on higher levels of RAID be helpful? In other words, there would be no notion of primary and backup sites, and each site would store some (possibly coded) segments of the data. Moreover, when the system considers the locality of multiple sites, the problem becomes harder. Volley: Automated Data Placement for Geo-Distributed Cloud Services, S. Agarwal et al, NSDI 2010 As many cloud-based applications gain global usage, the paper identifies the need for automatic and dynamic placement of data across geo-distributed datacenters. The paper proposes a placement algorithm called Volley, paired with application-specific data migration mechanisms, that considers datacenter capacity limits, user mobility, shared data, data inter-dependencies, and locality, and tries to optimize the placement in terms of latency and cost using an iterative optimization algorithm. The first contribution of the paper is the analysis of data traces from two commercial cloud-based applications, Live Mesh and Live Messenger. In particular, geographically distant data sharing and data inter-dependencies are two important things to be considered. Due to these two observations, locality is not as simple as placing the data closest to its users. The second contribution is the iterative optimization algorithm based on weighted spherical means, Volley. It is definitely useful to have such an algorithm that can scale to the size of huge cloud-based applications. Though the paper mentions that there is no theoretical analysis of how good an optimization Volley can achieve, Volley seems to be much better than other simple heuristics. One other minor advantage of the system is that it explicitly separates Volley's placement decisions from the actual data migration. With this extra layer, Volley becomes more flexible and can be used by different datacenter applications. Comments/questions/critiques: Data placement seems to be a very hard optimization problem, so it might be OK that there is not yet any theoretical analysis. However, it would be interesting to see how quickly Volley converges. The paper only mentions that with a "good" initial placement, Volley can converge quickly. But the whole system also gives each datacenter/administrator much freedom to specify their policies, constraints, and replication schemes; how would this impact convergence? The paper also mentions that datacenters have the freedom to use other migration techniques at the same time. I wonder whether, in this case, Volley would be as effective as in the case where only Volley is used. It also seems that Volley needs to take all of its input at the start of the first phase, which might not be suitable in a very dynamic environment. An online version of the algorithm would be useful. Also, iterative algorithms usually have a certain amount of self-healing, which might also be useful in a dynamic environment. It would be interesting to see some analysis.
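The convergence question is easier to reason about with the phase-2 update written out: each data item is repeatedly pulled toward the clients that access it and toward the data items it communicates with, weighted by communication frequency. The sketch below shows such an iteration on a flat plane for brevity (Volley actually works on the sphere with weighted spherical means); all names, coordinates, and weights are invented for illustration.

```python
# Simplified, planar stand-in for Volley's phase 2: each round, every data item
# moves to the weighted mean of the clients that access it and the data items
# it exchanges messages with. This is not the paper's exact update rule.

def weighted_mean(points, weights):
    total = sum(weights)
    return (sum(w * x for (x, _), w in zip(points, weights)) / total,
            sum(w * y for (_, y), w in zip(points, weights)) / total)

def iterate_placement(item_pos, client_pos, accesses, links, rounds=10):
    """accesses[item] = [(client, weight)]; links[item] = [(other_item, weight)]."""
    pos = dict(item_pos)
    for _ in range(rounds):
        new_pos = {}
        for item in pos:
            pts = [client_pos[c] for c, _ in accesses.get(item, [])] + \
                  [pos[o] for o, _ in links.get(item, [])]
            wts = [w for _, w in accesses.get(item, [])] + \
                  [w for _, w in links.get(item, [])]
            new_pos[item] = weighted_mean(pts, wts) if pts else pos[item]
        pos = new_pos
    return pos

clients = {"c1": (0.0, 0.0), "c2": (10.0, 0.0)}            # made-up client coordinates
items = {"A": (5.0, 5.0), "B": (9.0, 9.0)}                 # made-up initial placements
accesses = {"A": [("c1", 8), ("c2", 2)], "B": [("c2", 5)]}
links = {"A": [("B", 4)], "B": [("A", 4)]}                 # A and B exchange messages often
print(iterate_placement(items, clients, accesses, links))
```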
From: Qingxi Li [cs.qingxi.li SpamElide] Sent: Thursday, March 10, 2011 1:51 AM To: Gupta, Indranil Subject: 525 review 03/10 Smoke and Mirrors: Reflecting Files at a Geographically Remote Location without Loss of Performance This paper introduces network-sync, a mechanism for mirroring a whole file system remotely. Because large-scale disasters do happen, most data centers copy their data to a geographically distant site for safety. There are three kinds of mirroring solutions: 1. Synchronous mirroring: block applications until data is safely mirrored. 2. Semi-synchronous mirroring: as soon as the data is written locally, send the update to the remote site. But if a disaster happens at this time and the update packet is lost in the network, the data will be lost. 3. Asynchronous mirroring: periodically send batches of updates to the remote site; if a disaster happens between two updates, all data updated in that period will be lost. Network-sync builds on the semi-synchronous mirroring solution and sends error recovery packets together with the update data to make the whole mechanism more reliable. After the error recovery packets have been sent, the router gives an acknowledgement to the site, and the site tells the user it can proceed with the next update. Besides this, network-sync uses a distributed log-structured file system with an append-only format to keep all the updates in order. I was impressed by the append-only log structure: since it only writes at the end of the log, the updates are automatically kept in order. That said, the error recovery packet can also be lost in transmission. If the authors just want to send two messages to increase reliability, why not simply send the packet twice? Volley: Automated Data Placement for Geo-Distributed Cloud Services The paper introduces an automated data placement system that calculates the best positions for data across distributed data centers. Volley analyzes the logs of datacenter requests and uses an iterative optimization algorithm to find the best way to migrate data across those datacenters. When migrating data, many things should be considered, like datacenter capacity, bandwidth, the user's location, and the locations of related data. Volley first maps each data item directly accessed by users to the weighted average of the geographic coordinates of the accessing IP addresses. For the other data items, which are not directly accessed by users, Volley maps each to the weighted spherical mean of the data items that communicate with it. After that, Volley iteratively moves the data items closer to both clients and related data items using a weighted spring model, which means that the distance between them and the number of communications between them increase the force pulling them together. After that, all the data is placed in the closest data center, and if a datacenter runs out of capacity, the least-accessed data items are moved to the next-closest data center. However, there are some problems with Volley. First, if request locations change frequently, or the period of change differs from the period of the calculation, then the computed optimal location may not be the real optimal location for future requests. The other problem is that the authors use geographic location to represent distance. But from my last semester's course project, I found that the geographic distance between two nodes can sometimes be very different from the latency.
When we tested the RTT from UIUC to two nodes in Berkeley, one node's RTT was around 200ms while the other's could even reach 1800ms; for a node located in HK the RTT was 190ms, while for a node located in Shanghai, which is near HK, the RTT was around 270ms. So maybe the authors could find a new algorithm that directly considers RTT instead of geographic location. From: yanenli SpamElide on behalf of Yanen Li [yanenli2 SpamElide] Sent: Thursday, March 10, 2011 1:02 AM To: Gupta, Indranil Subject: cs 525 review 03/10 cs 525 review 03/10 Paper 1: Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance summary: This paper discusses the Smoke and Mirrors File System (SMFS), which provides a backup service for files at geographically distributed sites. Existing mirroring solutions have a trade-off between performance and reliability. Synchronous mirroring solutions are very safe, but extremely sensitive to link latency. Asynchronous mirroring solutions have better performance, but data that has not been synchronized will be lost. In contrast, SMFS provides semi-synchronous mirroring between different sites, which strives to achieve better performance at the cost of some risk of data loss. SMFS introduces a new mirroring technique called network-sync, which adds redundancy at the network level to transmitted data and provides feedback once the redundant data has been sent. The SMFS system is evaluated against several levels of synchronization, such as local-sync, remote-sync, network-sync, local-sync+FEC, and remote-sync+FEC. Pros: 1. A good solution for satisfying both performance and safety 2. Among the first to utilize network buffers as reliable storage for transient data 3. Network-sync is a general technique that can be applied in other places Cons: 1. Heavily depends on network stability, which is not guaranteed, e.g., when the network is congested 2. The wide area network has long round-trip-time latency, which is not desirable in the cloud-computing era 3. The distributed algorithms are not described in detail Paper 2: Volley: Automated Data Placement for Geo-Distributed Cloud Services Summary: Volley is the first work to discuss heuristic data placement strategies for geo-distributed datacenters in WANs. Datacenters today are distributed across the globe, but they still need to share data with other datacenters as well as their clients. Besides shortest distance, there are several additional constraints to be considered, including business constraints, WAN bandwidth costs, datacenter capacity limits, data interdependencies, user-perceived latency, etc. One heuristic is to colocate data that are tightly coupled; another important heuristic is that there can be significant benefits to placing data closest to those who use it most heavily, rather than just placing it close to some particular client. Volley takes as input the request logs for data in the system, analyzes them, and outputs results on where to best place the data. Volley is an iterative algorithm: 1) Volley computes an initial placement based on client IPs; 2) Volley iteratively moves data to reduce latency; 3) Volley iteratively maps data items into datacenters, taking into account the datacenter capacities. The Live Mesh and Live Messenger traces are used to evaluate Volley. Volley is compared with the commonIP protocol, the hashing protocol, and oneDataCenter, and among these protocols Volley achieves the lowest latency. Pros: 1.
Pros: 1. There is previous work on data placement in LANs and WSNs, but Volley is the first to address data-placement strategies for WANs. 2. The heuristics make a lot of sense and go beyond the simple closest-distance intuition. Cons: 1. It is not clear whether Live Messenger and Live Mesh represent common patterns of cloud data usage; what about the search usage pattern? 2. The strategies are simple; more complex strategies might achieve much lower latency. 3. The evaluation focuses only on latency; what about throughput? -- Yanen Li Ph.D Candidate Department of Computer Science University of Illinois at Urbana-Champaign Tel: (217)-766-0686 Email: yanenli2 SpamElide From: mark overholt [overholt.mark SpamElide] Sent: Wednesday, March 09, 2011 8:25 PM To: Gupta, Indranil Subject: 525 review 03/10 Mark Overholt CS525 03/10/2011 Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance Summary: This paper presents a technique for mirroring files at remote locations without significant loss of performance, with increased tolerance to a total failure at the primary storage site. To achieve this goal, the authors present a syncing method called network-sync, which is close to the semi-synchronous style of mirroring. After the primary site performs an operation on its file system, an egress router forwards the IP packets, sends some error-correction packets, and sends feedback to notify the primary-site storage system that the packets are already in transit. The client at the primary location can then continue with its job. Thus, rather than waiting for an acknowledgement from the remote site, it waits only for feedback from an underlying communication layer acknowledging that the data and repair packets have been placed on the external wide-area network. In effect, the design treats the wide-area network between the primary and remote sites as a storage system that temporarily holds data. They call this redundancy-adding and feedback mechanism Maelstrom. Maelstrom generates error-correction packets using a Forward Error Correction (FEC) technique and uses callbacks to improve performance (see the sketch below). According to the experimental results, SMFS provides good data safety without much loss of performance. Discussion: Pros: A good use of efficient techniques to achieve an intermediate solution between synchronous and semi-synchronous mirroring; this could be used for a large subset of transferred data. The failure model considered in the study is quite comprehensive. Cons: The scheme does not guarantee reliability, as it depends on network stability, which is unreliable. FEC is also unable to recover data in certain cases, such as when most of the data is lost or corrupted. TCP retransmissions can fail if the source node fails. It is not clear which of their two design choices gives a better reliability-versus-performance tradeoff. The wide-area network has long round-trip latency.
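To make the Maelstrom repair-packet idea concrete, here is a toy Python sketch of XOR parity over a window of update packets. Maelstrom's real encoding is more sophisticated; the window size, packet format, and function names here are invented for the example.

# Toy illustration of proactive repair packets (not Maelstrom's real encoding).
# One XOR parity packet per window of data packets can recover any single
# lost packet in that window without a retransmission from the primary site.
from functools import reduce

WINDOW = 4  # hypothetical window size

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def make_repair(window):
    """XOR of all (equal-length) packets in the window."""
    return reduce(xor_bytes, window)

def recover(window_with_gap, repair):
    """Rebuild the single missing packet (None) from the survivors + repair."""
    survivors = [p for p in window_with_gap if p is not None]
    return reduce(xor_bytes, survivors, repair)

packets = [bytes([i] * 8) for i in range(WINDOW)]     # four 8-byte updates
repair = make_repair(packets)
damaged = [packets[0], None, packets[2], packets[3]]  # packet 1 lost in flight
assert recover(damaged, repair) == packets[1]

In network-sync, the primary unblocks the client once the data and repair packets are on the wire, on the bet that this kind of in-network recovery handles most single-packet losses without a round trip back to the (possibly destroyed) primary.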
RACS: A Case for Cloud Storage Diversity Summary: A Redundant Array of Cloud Storage (RACS) is a cloud storage solution that attempts to reduce both the likelihood of data loss and the cost of switching storage providers. The authors propose a system where data is striped together with parity data, similar to RAID 5; while RAID operates at the disk level, RACS operates at the cloud level, striping your data across multiple cloud providers instead of multiple disks (a toy striping sketch follows below). The main reason they cite for this design is the idea of data inertia. Because cloud providers charge based on the bandwidth in and out of their systems, moving large amounts of data from one provider (incurring outbound-bandwidth costs) to another (incurring inbound-bandwidth costs) is expensive. Data inertia says that the more data you have stored with a cloud provider, the more it will cost to move that data. So if a company stores large amounts of data and either that provider performs poorly or a better provider appears on the market, the financial burden of moving the data can be too great to take advantage of the better option. RACS is a proxy system that can help alleviate provider outages and data inertia. RACS provides the same interface as Amazon S3, which makes it very easy to insert RACS in front of an existing S3 client application. RACS uses erasure coding to break data into small chunks and then stores those chunks with different providers. The data has parity built in, so only a subset of the chunks is needed to rebuild it. With this method, if a storage provider goes down, you can still rebuild your data from the chunks held by the other providers. Also, if a new provider comes along that you want to use, or if one of your providers is not performing well, you can either simply stop using the data stored with that provider or move it; since each provider only stores a small fraction of the total data, the move is far less costly. Discussion: Pros: I think this is a fantastic design. With minimal overhead and a transparent proxy, you get highly redundant data and the ability to move data at much lower cost, and the chance that so many providers fail at once that fewer than the required number of chunks remain is very small. The authors took into account that the RACS proxy could be a bottleneck and hence made it a distributed proxy. Cons: The need to maintain many different cloud storage accounts could become cumbersome. RACS would also have to know all of these account details in order to place files properly, based on a cost analysis of which accounts you have and their prices.
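As a rough illustration of the RAID-5-style striping described above, here is a minimal Python sketch that stripes data across hypothetical providers with one XOR parity chunk per stripe. The provider names and chunk size are invented, and the actual RACS prototype uses a more general (m, n) erasure code rather than single parity.

# Toy RAID-5-style striping across cloud providers (illustrative only).
providers = ["provider-a", "provider-b", "provider-c"]  # hypothetical names

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def stripe(data, chunk_size=4):
    """Split data into fixed-size chunks plus one XOR parity chunk per stripe
    of len(providers) - 1 data chunks; any single missing chunk is recoverable."""
    chunks = [data[i:i + chunk_size].ljust(chunk_size, b"\0")
              for i in range(0, len(data), chunk_size)]
    k = len(providers) - 1          # data chunks per stripe
    stripes = []
    for i in range(0, len(chunks), k):
        group = chunks[i:i + k]
        while len(group) < k:
            group.append(b"\0" * chunk_size)
        parity = group[0]
        for c in group[1:]:
            parity = xor_bytes(parity, c)
        stripes.append(group + [parity])   # one chunk per provider
    return stripes

def recover_chunk(stripe_row, lost_index):
    """Rebuild the chunk at lost_index by XOR-ing all the surviving chunks."""
    survivors = [c for j, c in enumerate(stripe_row) if j != lost_index]
    rebuilt = survivors[0]
    for c in survivors[1:]:
        rebuilt = xor_bytes(rebuilt, c)
    return rebuilt

row = stripe(b"hello cloud!")[0]        # one stripe: 2 data chunks + parity
assert recover_chunk(row, 1) == row[1]  # lose provider-b's chunk, rebuild it

Recovery touches only the surviving providers' chunks, which is also why abandoning any single provider means moving only that provider's share of the data.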
From: Nipun Sehrawat [sehrawa2 SpamElide] Sent: Wednesday, March 09, 2011 7:59 PM To: Gupta, Indranil Subject: 525 review 03/10 RACS: A Case Study for Cloud Storage Diversity This paper applies RAID ideas in the novel context of cloud storage, to avoid vendor lock-in and tolerate provider failures and outages while incurring minimal extra cost for these benefits. Vendor lock-in is a significant challenge that might prevent people from moving their large datasets to the cloud: most cloud providers charge both incoming and outgoing network fees, making it very costly to transition from one provider to another (resulting in "data inertia"). Outages, though rare (SLAs provide strict guarantees and cloud providers maintain multiple replicas), can lead to permanent data loss. Economic failure of a cloud provider might force all of its users to migrate to other providers, incurring high transfer costs. The key idea in this paper is, as in RAID, to do block-level striping of data across several cloud providers and maintain some redundancy, instead of storing all of the data with one provider. Since the data is distributed among multiple providers, it is easy to shun an expensive provider for a better price, because the amount of data that must be transferred is smaller. The added redundancy helps overcome outages (datacenter failures, network partitions, etc.) to avoid data loss and provide high availability. RACS uses erasure coding to map m small fragments of a data object to n fragments (n > m), such that the original data object can be recovered from any m of these n fragments; thus, the amount of storage required in a RACS setup is n/m times the size of the data. n and m can be chosen by analyzing provider SLAs and the importance of the data to be stored. RACS is implemented as a proxy between applications and storage servers and is compatible with Amazon S3's interface, so unmodified applications can use RACS transparently. To avoid RACS itself becoming a bottleneck while communicating with the storage servers, multiple RACS proxy instances sit between clients and storage servers; the instances are synchronized using ZooKeeper to avoid race conditions when different clients work with a common piece of data via different RACS proxies. When retrieving data, only m of the n fragments of a data object need to be fetched; various policies, such as sending requests to all n servers and recreating the data from the first m replies, can be used depending on latency and cost requirements (a small sketch of this policy follows below). Pros: - RACS circumvents the problem of vendor lock-in, overcoming a major concern about moving storage to the cloud. - On top of the replication done by cloud providers, RACS provides an extra level of redundancy, making the cloud a more reliable platform for storing important data. Cons: - Erasure coding is computation-intensive and may become the bottleneck in the presence of a high-speed network connection. Suggestions/Thoughts: - m and n seem to be static parameters; a user may later want to change them to strike a different balance between redundancy and extra cost. I wonder if it is possible to allow some limited changes to m and n: we could re-encode on demand, so if a data object is to be updated (read and then written), we can delete its previous fragments, encode it with fresh values of m and n, and upload it again (incurring no additional network cost, since we were going to do a get() and put() for this update anyway).
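The "request all n, decode from the first m replies" retrieval policy mentioned above can be sketched with a thread pool. Everything here (provider names, latencies, the fetch and decode stubs) is a placeholder invented for illustration, not the RACS prototype's API.

# Illustrative sketch of a first-m-of-n retrieval policy (placeholder API).
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

PROVIDERS = ["provider-a", "provider-b", "provider-c", "provider-d"]  # n = 4
M = 2                                                                  # need any 2

def fetch_fragment(provider, key):
    """Stand-in for a real storage GET: random latency, returns one fragment."""
    time.sleep(random.uniform(0.01, 0.2))
    return (provider, b"fragment-of-" + key)

def decode(fragments):
    """Stand-in for erasure-decoding the object from any M fragments."""
    return b"reconstructed:" + fragments[0][1]

def get_object(key):
    """Request the fragment from all providers; decode from the first M replies."""
    with ThreadPoolExecutor(max_workers=len(PROVIDERS)) as pool:
        futures = [pool.submit(fetch_fragment, p, key) for p in PROVIDERS]
        got = []
        for fut in as_completed(futures):
            try:
                got.append(fut.result())
            except Exception:
                continue                 # ignore slow or failed providers
            if len(got) >= M:
                return decode(got)       # fastest M replies win; rest are ignored
    raise IOError("fewer than %d fragments retrieved" % M)

print(get_object(b"photo.jpg"))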
--- Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance The central idea of this paper is to provide a mirroring solution, called network-sync, that strikes a balance between the loss of performance (latency seen by the end user before an operation completes) and the amount of data that can be lost in a disaster. The two extremes are synchronous mirroring, where the application blocks until the update has been applied at both the primary and the remote site, and asynchronous mirroring, where updates are applied only locally and sent to the remote site periodically, in batches. These define the two extremes of performance (asynchronous is better) and data-loss prevention (synchronous is better). Network-sync is built on a simple idea: once the update packets have been offloaded onto the network, the network (dedicated optical fiber links) can be treated as a temporary disk storing this data. Thus, even if the primary site fails before the packets reach the mirror site, the in-flight data is considered safe in the 'network disk'. However, the network might lose some packets, making this in-flight data prone to loss and corruption. To tackle this, a proactive recovery mechanism is used and additional error-correction packets are sent along with the original data. This serves two purposes: if the original data is corrupted, corrections can be made instantly, without waiting for retransmission from the primary site (as with plain TCP); and it covers the case where the primary site fails just after offloading the update messages onto the network and therefore cannot resend corrupted packets. In network-sync, redundancy is achieved with a Forward Error Correction (FEC) technique, and applications are unblocked as soon as the update messages have been offloaded to the temporary "network disk"; this is done through callbacks issued after all the error-correction packets have been sent. Pros: - Provides a novel mirroring approach in which the network is treated as a (somewhat) reliable disk that temporarily holds data until it reaches the mirror site. This temporary disk is made more reliable by proactively sending error-correction packets and by using dedicated optical fiber links. Cons: - The paper says very little about the SMFS file system itself and focuses more on the network-sync part (Maelstrom), which I believe is previous work from the same authors. From: Tengfei Mu [tengfeimu SpamElide] Sent: Wednesday, March 09, 2011 5:33 PM To: Gupta, Indranil Subject: 525 review 03/10 1. RACS: A case for cloud storage diversity This paper proposes RACS, a proxy that transparently stripes user data across multiple cloud storage providers so that customers can avoid vendor lock-in and achieve better fault tolerance for their data. The authors are motivated by improving reliability and by the economic problems the cloud market poses for customers. The key idea is based on RAID, especially level 5, which is not novel; but to be practical in the cloud arena they make RACS distributed, and customers can adjust the trade-off between data-migration overhead costs and vendor mobility. They build a prototype of RACS that is compatible with existing Amazon S3 clients and uses multiple other storage providers as backups. Pros: 1. RAID is a good, proven approach for reliability and for tolerating economic failure, and the paper chooses a good scenario, the cloud, for applying it. 2. The RACS proxy is distributed, which makes the whole system more scalable. Cons: 1. RACS introduces overhead that may not be necessary in the short term, given the current cloud storage market. 2. Smoke and Mirrors: Reflecting files at a geographically remote location without loss of performance The paper proposes the Smoke and Mirrors File System (SMFS), which aims to reflect files at a geographically remote location using a novel network-sync mirroring approach. While retaining the performance of existing semi-synchronous and fully asynchronous mirroring solutions, the network-sync approach provides stronger reliability. Specifically, in SMFS the primary-site storage system not only applies a request locally but also simultaneously forwards it toward the remote site. The network-sync layer then sends an acknowledgment to the primary storage system, after which the primary storage system moves on to its next operation. When the remote-site storage system receives the request, it applies it to local storage and sends acknowledgments back to the primary site. The authors evaluated their network-sync remote mirroring prototype on the Cornell NLR Rings testbed.
Pro: 1. SMFS improves the tradeoff between performance and reliability: it improves performance compared to synchronous replication and improves reliability compared to semi-synchronous solutions. Con: 1. The paper focuses on high-bandwidth wide-area networks; in reality, wide-area networks are more likely to be low-bandwidth and high-latency. From: Ankit Singla [iitb.ankit SpamElide] Sent: Wednesday, March 09, 2011 1:21 PM To: Gupta, Indranil Subject: 525 review 03/10 1. Smoke and Mirrors (SMFS) ------------------------------------------- Summary: The paper targets a new design point in the trade-off between the data safety provided by replication and the performance overheads it incurs. Synchronous mirroring provides the strongest guarantees on data safety, but its performance is tightly coupled to the latency between remote sites because operations wait for the data to be replicated. Asynchronous mirroring's performance is independent of latency, but there is significant risk of data loss. The paper targets a space between these extremes by exploiting network-level redundancy, which they call 'network-sync'. In addition, they provide a 'group consistency' primitive: a group of updates can be reflected at a mirror site atomically, in an all-or-nothing fashion. Their idea is fairly simple: use forward error correction to make the network more reliable; once data is in flight toward the remote mirror, assume that the network will deliver it and send the client the acknowledgement. They maintain state at the primary site, which waits for the remote site to acknowledge an update before garbage-collecting this state. Comments: I like the simplicity of their idea. The discussion of using the criticality of each update to decide what safety guarantee to provide is also very fruitful, although they do not fully explore that angle. I would expect that one could classify an application's updates into a few distinct classes, each associated with its own safety guarantee, which would translate into different syncing configurations (synchronous mirroring for the most critical, etc.). The use of log-based file systems seems to be becoming popular as storage becomes cheap and worries about defragmentation/cleaning become less significant for many applications. 2. RACS -------------- Summary: The paper explores the simple idea of applying RAID techniques at the cloud level instead of in a local storage system. As with RAID, this adds fault tolerance, but in the cloud environment it brings the added advantage of preventing provider lock-in. It seems the rest of the system could be built easily on top of something like Cumulus (which incidentally appeared in the same edition of FAST in 2009). While most of their ideas are very close to RAID, the heterogeneous environment RACS targets prompts a look at policies for deciding which provider to use for how much data (based on cost, reliability, latency, etc.); they do discuss this briefly (policy hints). Comments: I am a little skeptical about how significant provider lock-in will be in the near future. A large number of big firms are working to ensure data portability, and it is not unreasonable to expect antitrust legislation in this direction with standard norms for data export; this part of the motivation may become a non-issue fairly soon. The data-inertia argument is still fairly significant, and the paper makes a good case for monetary savings in this respect.
The catastrophic failure of a particular cloud provider is also a strong argument in favor of the paper's motivation. In light of the other paper (Smoke and Mirrors), I wonder how the added cost of a put operation affects things: does the client have to wait for an interval equal to the maximum of the latencies to all providers before it is certain that an update has been applied? Ankit -------------------------------------------------------------------- Ankit Singla Graduate Student, Computer Science University of Illinois at Urbana-Champaign (UIUC) http://www.cs.illinois.edu/homes/singla2/ From: Tony Huang [tonyh1986 SpamElide] Sent: Saturday, March 05, 2011 3:54 PM To: Gupta, Indranil Cc: Tony Huang Subject: 525 review 03/10 Zu C Huang zuhuang1 * Smoke and Mirrors: Reflecting Files at a Geographically Remote Location Without Loss of Performance Core Idea: In this paper, the authors introduce a new consistency model called network-sync. Network-sync is not as fault tolerant as remote-sync, which is synchronous and strongly consistent, meaning that the user only gets a response once the remote site acknowledges receiving the replica. It is, however, more fault tolerant than local-sync, which returns immediately after the data is committed to the local disk. The main idea is to use the network itself as another means of protecting the data: redundancy is added at the network transmission level, so data is now sent along with forward-error-correction packets, and feedback loops are added so that end nodes can identify errors. Pros: * A convenient incremental change that improves the fault tolerance of current systems. * Conceptually simple. Cons: * Might this design increase traffic precisely when the datacenter is experiencing network problems? * Lacks a good theoretical discussion of how much this system improves data-loss prevention. * Volley: Automated Data Placement for Geo-Distributed Cloud Services. Core Idea: Volley is a system designed to automatically place a service across different geographic locations in response to users' geo-location patterns. The geographic distribution of users is determined from their IP addresses. The paper first presents an interesting study of current user patterns for online services: a lot of data is shared between multiple users located across multiple geographic locations, data interdependence further complicates the problem, application changes can significantly alter user patterns, and some users are highly mobile. Volley tries to minimize the geographic distance between a service and its users. After the necessary information is collected, Volley runs a placement algorithm that determines the best datacenter location for each job; network operators can then move jobs according to Volley's suggestions. Volley uses a spring model to capture the relationship between a service and its users. Pros: * Provides an automatic and theoretically rigorous way to define the geo-distance between users and services. * Uses a mature model and optimization algorithm to derive a sound algorithm for computing job placement. Cons: * A closer geographic location does not necessarily mean better latency or performance, because of heterogeneity in datacenters and network conditions. * It does not consider the bandwidth consumed by moving services across datacenters; if the service being moved is huge, such as YouTube videos, this would induce heavy background traffic and waste a significant amount of bandwidth.
* Will oscillation become a serious problem when Volley is deployed for applications that are popular across the globe? * The authors should consider the option of simply starting another instance of a service instead of moving the current instance. -- Regards -- Tony