April 10, 2014 - by Jerry Jelinek
When we announced Manta last summer, we knew that one of the first new features we'd want to add was a way to access the objects in Manta in the same simple way you access local files. Today we're excited to announce that this new capability is ready for general use.
When we set out to build this new feature, it was pretty clear to us that the best way to do this was to build an NFS gateway into Manta. NFS is a good choice because it's a mature, simple, and standardized file system protocol. Because of this, it's a core feature in many different operating systems, and certainly in the ones that Manta users are most likely to be running, such as SmartOS, Linux, MacOS, or Windows.
Thus, we built a user-level NFS server that runs on any local system, services NFS requests from the local system or from remote systems, and gateways those requests back into Manta. Once the server is running, you can see how easy it is to mount your Manta object store onto the local file system and then work directly with your objects in Manta as if they were local files.
% sudo mount 127.0.0.1:/foo/stor /mnt
% ls /mnt
data         project.txt  spec.pdf
plans.txt    roadmap.pdf
% cp ~/results.txt /mnt/data
Although the Manta CLI tools are easy to use, this is even easier, since all of the native commands for manipulating files are directly available. If you are running on an OS that provides a graphical tool for managing the file system, such as the MacOS Finder, then that works too.
While the gateway provides easy access to the object storage side of Manta, it cannot expose any of the integrated compute capabilities, which are among Manta's most compelling features. For that, the Manta CLI must still be used. However, the gateway does make it easier to get data into Manta for later use in compute jobs.
Within Manta, data is stored as atomic objects. Unlike files, there is no such thing as a partial update to an existing object. However, NFS, by definition, is a file-centric model that exposes the operations commonly performed on files. To enable these two different models to work together, the NFS server implements a local, internal cache which provides the normal file operations on cached Manta objects. The cache implements a write-back model in which dirty files are periodically written to Manta as objects.
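To make the write-back idea concrete, here is a minimal sketch in Python. It is not the gateway's actual implementation: the class name, the method names, the flush interval, and the use of a plain dict as a stand-in for Manta are all assumptions made for illustration. It only shows the general shape of the technique: partial reads and writes operate on a locally cached copy, and a dirty file is later written back as one whole object.

```python
import time


class WriteBackCache:
    """Toy write-back cache (illustrative only, not the gateway's code)."""

    def __init__(self, store, flush_interval=5.0, clock=time.monotonic):
        self.store = store                    # stand-in for the object store: path -> bytes
        self.flush_interval = flush_interval  # seconds between periodic write-backs (assumed value)
        self.clock = clock
        self.files = {}                       # path -> bytearray holding the local cached copy
        self.dirty = set()                    # paths modified since the last write-back
        self.last_flush = clock()

    def _load(self, path):
        # On first access, fetch the whole object; a missing object starts out empty.
        if path not in self.files:
            self.files[path] = bytearray(self.store.get(path, b""))
        return self.files[path]

    def read(self, path, offset, length):
        return bytes(self._load(path)[offset:offset + length])

    def write(self, path, offset, data):
        # A partial update touches only the cached copy and marks it dirty.
        buf = self._load(path)
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))  # zero-fill a gap, as a file would
        buf[offset:offset + len(data)] = data
        self.dirty.add(path)

    def flush(self):
        # Write each dirty file back as one complete object; objects are never patched in place.
        for path in sorted(self.dirty):
            self.store[path] = bytes(self.files[path])
        self.dirty.clear()
        self.last_flush = self.clock()

    def tick(self):
        # Call periodically; writes back only once the flush interval has elapsed.
        if self.clock() - self.last_flush >= self.flush_interval:
            self.flush()
```

In this sketch, a write followed immediately by a read returns the new data from the cache, while the backing store still holds the old object until the next flush. That gap is the window in which the cached view and the stored view can differ.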
Dave Pacheco, one of the architects of Manta, has written an interesting discussion about the CAP tradeoffs involved in Manta. Because of its internal cache, the NFS server makes different tradeoffs, since there is no longer a consistent view of the objects. However, aside from enabling file system semantics for Manta, the cache also provides increased performance and availability as compared to accessing the remote data.
We have barely scratched the surface of how we might use the NFS gateway internally, but we've already come up with several interesting examples. Saving log data into Manta is one case. By NFS-mounting the log directory, all logs are automatically saved into Manta with no special action needed. Another case is using the server to speed up builds. Since we save build artifacts into Manta, the server's cache makes it fast to access these components when building the other things that depend on them. We also use rsync to copy the results of our pkgsrc builds into Manta. We'll undoubtedly find many additional uses as time goes on.
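As a rough illustration of the logging case, the sketch below uses only Python's standard logging module and ordinary file I/O. The helper name make_logger and the /mnt/logs path are hypothetical, and the path is assumed to sit under the NFS mount. The application code contains nothing Manta-specific, because the directory it writes to is simply a mounted file system.

```python
import logging
import os


def make_logger(log_dir, name="app"):
    """Return a logger that appends to <log_dir>/<name>.log using plain file I/O."""
    os.makedirs(log_dir, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(os.path.join(log_dir, name + ".log"))
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


# Hypothetical usage: /mnt/logs is assumed to be a directory under the NFS mount.
# logger = make_logger("/mnt/logs")
# logger.info("build finished")
```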
To get started using the NFS gateway, see the documentation here.