Process and Transform Data at Rest
Compute processing is available on demand, directly on data at rest in Triton Object Storage; no data needs to move over the network for processing.
MapReduce and ETL functions are provided as Unix calls, so there is no analytics framework to learn. Many scripting languages are supported for easy customization.
Run one compute job or thousands, short or long running, to process data at any scale without provisioning compute instances. Pay only for what you use.
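Because map and reduce phases are just Unix commands, the same logic a job would run can be sketched as a plain local pipeline. Here is a minimal, hedged example using an illustrative log file: grep acts as the map (filter) phase and awk as the reduce (count) phase.

```shell
# Stand-in for an object stored in Triton Object Storage
printf 'GET /a 200\nGET /b 500\nGET /c 200\n' > access.log

# Map phase: filter matching records (plain grep, no framework)
# Reduce phase: count them (plain awk)
grep ' 500$' access.log | awk 'END { print NR }'
# prints 1
```

In a real job the same two commands would be supplied as the map and reduce phases and run where the object lives, rather than locally.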
Leverage pre-provisioned containers to run analytics jobs directly on data in Triton Object Storage. No need to deploy servers and software, or to copy data to compute nodes for processing.
No need to learn a MapReduce or ETL framework. Analytics jobs run ordinary Unix tools and scripts: shell, awk, grep, R, Python, Node.js, Perl, Ruby, Java, C/C++, even ffmpeg.
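To illustrate the "bring any tool" point, here is the same trivial aggregation written three ways with a throwaway input file; each of these one-liners could serve as a job phase unchanged.

```shell
printf 'alpha\nbeta\nalpha\n' > words.txt

# The same count in three of the supported tools; each prints 2
grep -c '^alpha' words.txt
awk '/^alpha/ { n++ } END { print n }' words.txt
python3 -c "print(sum(1 for l in open('words.txt') if l.startswith('alpha')))"
```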
Debug multi-phase map and reduce jobs. Open an interactive shell inside an analytics job, directly on the stored object, to inspect it, rerun steps, and save in-flight files.
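As a sketch of that debugging workflow, the node-manta CLI provides mlogin, which drops you into an interactive shell in a compute zone with the object mapped in. The path below is illustrative, and this fragment assumes a configured Manta/Triton CLI environment.

```shell
# Open an interactive shell in a compute zone, with the stored object
# available on stdin and via $MANTA_INPUT_FILE (path is illustrative)
mlogin /myuser/stor/logs/access.log
```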
Tune MapReduce init steps, task types, and resource allocations. Chain phased reducers, with outputs (maggr, mcat, mpipe, mtee, msplit) feeding cascading reducers in Unix-pipeline style.
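The cascading-reducer pattern can be sketched locally with plain sort and awk standing in for the job phases; the input data here is invented for illustration.

```shell
printf 'a 1\nb 2\na 3\n' > kv.txt

# Phase 1 reducer: sum values per key (the role maggr plays inside a job)
sort kv.txt \
  | awk '{ sum[$1] += $2 } END { for (k in sum) print k, sum[k] }' \
  | sort > sums.txt

# Phase 2 (cascading) reducer: grand total across all keys
awk '{ total += $2 } END { print total }' sums.txt
# prints 6
```

In a job, phase 1's output would be written with mpipe (or duplicated with mtee) so the next reducer consumes it, exactly as the pipe does here.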
Execute assets (scripts) in any language to convert images, transcode video, generate databases from access logs, and perform other format conversions.
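A minimal local sketch of the kind of transformation an asset script performs: rewriting access-log lines as CSV rows ready to load into a database. Filenames and log format here are invented for illustration.

```shell
printf 'GET /a 200\nPOST /b 404\n' > access.log

# A tiny "asset": convert space-separated log fields to CSV
awk '{ print $1 "," $2 "," $3 }' access.log > access.csv
cat access.csv
# prints:
# GET,/a,200
# POST,/b,404
```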
Common use cases include:
- Data analysis using standard tools like NumPy, SciPy, and R
- MapReduce processing with arbitrary scripts and code, without data transfer
- Clickstream analysis and MapReduce on logs
- Image processing: converting formats, generating thumbnails, resizing
- Video processing: transcoding, extracting segments, resizing
- Text processing, including search