Glossary

This glossary defines the key terms and concepts used throughout the Eventual platform and ev SDK documentation.

Core Concepts

Authentication

Browser-based login system used to access the Eventual platform. Use ev auth login to authenticate via your browser, eliminating the need to manage API keys manually.

Example: ev auth login opens your browser for secure authentication.

Client

Python class that provides programmatic access to the Eventual platform. Used to submit jobs, check status, and retrieve results.

Example: client = Client.default()

daft

Our multimodal query engine that powers Eventual’s data processing capabilities. Designed to process images, videos, audio, and text.

Related: Built by the same team that created Eventual

Environment (Env)

Runtime configuration that defines the dependencies, environment variables, and files needed for a job to run successfully.

Example: env = Env().pip_install(["torch", "pillow"])

Job

Fundamental unit of execution on the Eventual platform. A Python function decorated with @job.main that runs distributed processing tasks.

Example:

@job.main
def process_data(input_path: str):

Job Handle

Reference to a running or completed job that allows you to check status, retrieve results, and monitor progress.

Example: job_handle = client.run(job, args={"param": "value"})

Multimodal Data

Data that includes multiple types such as images, videos, audio, text, and structured data. Eventual specializes in processing these diverse data types together.

Examples: Product catalogs with images and descriptions, video content with transcripts

Resource

Referenceable entity that can be used across jobs and shared within your organization. Includes data volumes, ML models, and external services.

Example: model = Resource(name="classifier", path="s3://models/model.pkl")

Space

Logical grouping of resources for organization, access control, and isolation. Similar to projects or workspaces in other platforms.

Examples: “production”, “development”, “research”

Volume

Resource abstraction for data storage locations such as S3 buckets, databases, or file systems.

Example: volume = Volume(name="data", path="s3://bucket/data/")

Platform Features

Auto-scaling

Automatic adjustment of compute resources based on workload demands. Jobs scale up during heavy processing and scale down when idle.

Benefit: No manual resource management required

Fault Tolerance

Built-in resilience against hardware failures, network issues, and other problems. Jobs automatically retry failed operations and handle edge cases.

Features: Automatic retries, checkpoint recovery, graceful degradation
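The idea of automatic retries can be illustrated with a small, generic sketch. This is not Eventual's implementation: the retry helper, its parameters, and the flaky function are hypothetical names used only to show retrying with exponential backoff.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], max_attempts: int = 3, base_delay: float = 0.0) -> T:
    """Call operation until it succeeds or max_attempts is reached (hypothetical helper)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("max_attempts must be at least 1")


# Hypothetical flaky operation that fails twice, then succeeds.
calls = {"count": 0}


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = retry(flaky, max_attempts=5)
```

In practice a nonzero base_delay gives a struggling service time to recover between attempts.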

Distributed Processing

Automatic distribution of work across multiple machines or cores. Your Python code runs in parallel without requiring distributed systems knowledge.

Benefit: Scale from prototype to production without code changes

Job Lifecycle

Complete process from job submission to completion, including: Submit → Schedule → Execute → Monitor → Complete

Visibility: Full tracking and logging at every stage

Monitoring

Real-time tracking of job progress, resource usage, and performance metrics. Includes logging, metrics, and alerting capabilities.

Tools: CLI commands, dashboard UI, programmatic access

Technical Terms

Batch Processing

Processing multiple items together in a single operation for efficiency. Common in ML inference and data transformation tasks.

Example: Processing 32 images simultaneously instead of one at a time
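As a rough illustration of batching, the sketch below splits a list of items into groups of 32. The batched helper and the file names are hypothetical and not part of the ev SDK.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of up to batch_size items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Made-up file names standing in for real inputs.
image_paths = [f"image_{i}.jpg" for i in range(70)]
batches = list(batched(image_paths, 32))
batch_sizes = [len(batch) for batch in batches]  # two full batches and one partial batch
```

The final batch is smaller than the rest whenever the item count is not a multiple of the batch size, so downstream code should not assume a fixed batch length.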

CLI

Command Line Interface - The ev command-line tool for interacting with the Eventual platform from your terminal.

Usage: ev run ./job.py, ev jobs list

Decorator

Python syntax that uses the @ symbol to modify function behavior. The @job.main decorator turns a regular function into an Eventual job.

Example:

@job.main
def my_function():
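To show how decorators work in general, here is a minimal sketch of a decorator that records functions in a registry. The names register_job and REGISTRY are hypothetical; this does not describe how @job.main is implemented.

```python
from typing import Callable, Dict

# Hypothetical registry mapping function names to the functions themselves.
REGISTRY: Dict[str, Callable] = {}


def register_job(func: Callable) -> Callable:
    """Record func in REGISTRY and return it unchanged."""
    REGISTRY[func.__name__] = func
    return func


@register_job
def my_function() -> str:
    return "hello"


# Writing @register_job above the def is equivalent to:
#     my_function = register_job(my_function)
```

A decorator runs once, when the function is defined, which is what lets a framework discover the function before it is ever called.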

Dependencies

External packages required by your job, specified in the Environment configuration. Automatically installed before job execution.

Example: env.pip_install(["numpy", "pandas", "torch"])

Embeddings

Numerical representations of data (text, images, audio) that capture semantic meaning. Used for similarity search and ML tasks.

Use cases: Similarity search, recommendation systems, clustering
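Similarity search over embeddings usually comes down to comparing vectors. The sketch below uses cosine similarity, one common choice, on made-up three-dimensional vectors; real embeddings come from a model and have far more dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Made-up embeddings for illustration only.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.1, 0.9]

similar = cosine_similarity(cat, kitten)
dissimilar = cosine_similarity(cat, car)
```

Values close to 1 indicate vectors pointing in nearly the same direction, so here similar is much larger than dissimilar.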

Feature Extraction

Process of extracting meaningful characteristics from raw data (images, text, audio) using ML models.

Example: Extracting visual features from images using ResNet

Parquet

Columnar storage format optimized for analytical workloads. Commonly used with daft for efficient data processing.

Benefits: Compression, fast queries, schema evolution

SDK

Software Development Kit - The ev-sdk Python package that provides programmatic access to the Eventual platform.

Installation: pip install ev-sdk

Type Hints

Python annotations that specify parameter and return types. Required for Eventual job functions to ensure proper serialization.

Example: def process(data: str, count: int) -> Dict[str, Any]:
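Type hints are ordinary metadata that code can read at runtime. The sketch below uses the standard library's typing.get_type_hints to inspect the example signature; how Eventual itself uses the hints is not shown here, and the function body is a placeholder.

```python
from typing import Any, Dict, get_type_hints


def process(data: str, count: int) -> Dict[str, Any]:
    # Placeholder body: echo the inputs back.
    return {"data": data, "count": count}


# Maps each parameter name, plus "return", to its annotated type.
hints = get_type_hints(process)
```

A framework can use a mapping like this to decide how each argument and the return value should be encoded.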

Data Processing Terms

DataFrame

Tabular data structure used by daft for representing and processing datasets. Similar to pandas DataFrames but designed for distributed processing.

Operations: Filter, group, join, transform, aggregate

ETL

Extract, Transform, Load - Pattern for data processing pipelines. Eventual jobs commonly implement ETL workflows for multimodal data.

Stages: Extract from sources, Transform with ML models, Load to destinations

Pipeline

Series of connected processing steps where the output of one step becomes the input of the next. Common pattern in data processing and ML workflows.

Example: Data ingestion → Feature extraction → Model inference → Result storage
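The "output of one step becomes the input of the next" pattern can be sketched as plain function composition. Every step below is a hypothetical stand-in (no real model or storage is involved), and run_pipeline is an illustrative helper rather than an ev SDK function.

```python
from functools import reduce
from typing import Any, Callable, Dict, List, Sequence, Tuple


def run_pipeline(steps: Sequence[Callable[[Any], Any]], initial: Any) -> Any:
    """Feed initial through each step in order, passing results along."""
    return reduce(lambda value, step: step(value), steps, initial)


def ingest(paths: List[str]) -> List[str]:
    # Stand-in for data ingestion: normalize the raw inputs.
    return [path.strip() for path in paths]


def extract_features(paths: List[str]) -> List[Tuple[str, int]]:
    # Stand-in for feature extraction: use the name length as the "feature".
    return [(path, len(path)) for path in paths]


def infer(features: List[Tuple[str, int]]) -> List[Tuple[str, str]]:
    # Stand-in for model inference: a fixed threshold instead of a model.
    return [(path, "long" if size > 9 else "short") for path, size in features]


def store(predictions: List[Tuple[str, str]]) -> Dict[str, str]:
    # Stand-in for result storage: collect results into a dictionary.
    return dict(predictions)


results = run_pipeline(
    [ingest, extract_features, infer, store],
    [" a.jpg", "holiday_photo.jpg "],
)
```

Keeping each step a separate function makes it easy to test steps in isolation or swap one out.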

Query Engine

System for executing queries against datasets. daft is the query engine that powers Eventual’s data processing capabilities.

Features: SQL-like operations, distributed execution, multimodal data support

Serialization

Process of converting data structures into a format that can be stored or transmitted. Eventual automatically handles serialization of job parameters and results.

Formats: JSON, pickle, parquet
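For intuition, the sketch below round-trips a dictionary through two of the listed formats using only the Python standard library. The parameter values are made up, and this does not show which format Eventual picks in any given case.

```python
import json
import pickle

# Made-up job parameters for illustration.
params = {"input_path": "s3://bucket/data/", "batch_size": 32}

# JSON produces human-readable text.
json_text = json.dumps(params)
from_json = json.loads(json_text)

# pickle produces Python-specific bytes.
pickled = pickle.dumps(params)
from_pickle = pickle.loads(pickled)

# JSON has no tuple type, so a tuple comes back as a list.
tuple_round_trip = json.loads(json.dumps((1, 2)))
```

The tuple case shows why the choice of format matters: not every format preserves every Python type exactly.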

Schema

Structure definition for data, including column names, types, and constraints. Important for data validation and processing optimization.

Example: {"name": "string", "age": "integer", "image_path": "string"}
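A schema like the example can drive simple record validation. The sketch below is a minimal illustration: the validate helper and the mapping from type names to Python types are assumptions, not part of daft or the ev SDK.

```python
from typing import Any, Dict, List

SCHEMA = {"name": "string", "age": "integer", "image_path": "string"}

# Assumed mapping from the schema's type names to Python types.
PYTHON_TYPES = {"string": str, "integer": int}


def validate(record: Dict[str, Any], schema: Dict[str, str]) -> List[str]:
    """Return a list of problems found in record; an empty list means valid."""
    errors = []
    for column, type_name in schema.items():
        if column not in record:
            errors.append(f"missing column: {column}")
        elif not isinstance(record[column], PYTHON_TYPES[type_name]):
            errors.append(f"{column}: expected {type_name}")
    return errors


good = validate({"name": "Ada", "age": 36, "image_path": "ada.png"}, SCHEMA)
bad = validate({"name": "Ada", "age": "36"}, SCHEMA)
```

Here good is empty, while bad reports the wrongly typed age and the missing image_path.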

Machine Learning Terms

Inference

Process of using a trained model to make predictions on new data. Common use case for Eventual jobs processing large datasets.

Example: Classifying millions of images with a pre-trained model

Model

Trained machine learning algorithm that can make predictions or extract features from data. Often stored as a Resource in Eventual.

Types: Classification, regression, feature extraction, generative

Pre-trained Model

Model that has already been trained on a large dataset and can be used for inference or fine-tuning. Commonly used in Eventual jobs.

Examples: ResNet for images, BERT for text, Whisper for audio

Tensor

Multi-dimensional array used to represent data in ML frameworks like PyTorch and TensorFlow. Common data structure in Eventual ML jobs.

Dimensions: 1D (vector), 2D (matrix), 3D (image), 4D (batch of images)
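The dimension counts above can be made concrete with nested Python lists. This is only an illustration of shapes: the shape helper is hypothetical and assumes regularly nested lists, whereas ML frameworks provide their own tensor types.

```python
from typing import Any, List, Tuple


def shape(nested: Any) -> Tuple[int, ...]:
    """Return the size of each dimension of a regularly nested list."""
    dims: List[int] = []
    current = nested
    while isinstance(current, list):
        dims.append(len(current))
        if not current:
            break
        current = current[0]  # assumes every sibling has the same length
    return tuple(dims)


vector = [1.0, 2.0, 3.0]                                 # 1D
matrix = [[1, 2, 3], [4, 5, 6]]                          # 2D: 2 rows, 3 columns
image = [[[0, 0, 0] for _ in range(4)] for _ in range(2)]  # 3D: height 2, width 4, 3 channels
batch = [image, image]                                   # 4D: 2 images
```

Each extra level of nesting adds one dimension, which is why a batch of images has one more dimension than a single image.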

Training

Process of teaching a model to make predictions by showing it examples. While Eventual can run training jobs, it’s more commonly used for inference.

Output: Trained model weights and parameters

Validation

Process of evaluating model performance on test data. Important step before deploying models in production Eventual jobs.

Metrics: Accuracy, precision, recall, F1-score
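The four metrics listed above have standard definitions for binary classification. The sketch below computes them from made-up labels and predictions; classification_metrics is a hypothetical helper written for this example.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up ground truth and predictions.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

Precision asks how many predicted positives were right, recall asks how many actual positives were found, and F1 is their harmonic mean.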

Infrastructure Terms

Cluster

Group of connected computers that work together to process jobs. Eventual automatically manages clusters for you.

Benefits: Parallel processing, fault tolerance, resource pooling

Compute

Processing resources (CPU, GPU, memory) used to run jobs. Eventual automatically allocates appropriate compute based on job requirements.

Types: CPU instances, GPU instances, memory-optimized instances

Container

Lightweight, portable execution environment that packages your code and dependencies. Eventual uses containers to run jobs consistently.

Benefits: Isolation, reproducibility, portability

Load Balancing

Distribution of work across multiple machines to optimize performance and resource usage. Handled automatically by Eventual.

Result: Efficient resource utilization, improved performance
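One simple load-balancing strategy is to give each new task to the worker with the least accumulated load. The sketch below illustrates that greedy strategy with a heap; it is not a description of Eventual's scheduler, and assign_tasks and the task costs are hypothetical.

```python
import heapq
from typing import Dict, List, Sequence


def assign_tasks(task_costs: Sequence[int], num_workers: int) -> Dict[int, List[int]]:
    """Assign each task index to the currently least-loaded worker."""
    # Heap entries are (current_load, worker_id); ties go to the lower worker id.
    heap = [(0, worker) for worker in range(num_workers)]
    heapq.heapify(heap)
    assignments: Dict[int, List[int]] = {worker: [] for worker in range(num_workers)}
    for index, cost in enumerate(task_costs):
        load, worker = heapq.heappop(heap)
        assignments[worker].append(index)
        heapq.heappush(heap, (load + cost, worker))
    return assignments


# Made-up task costs, distributed over two workers.
assignments = assign_tasks([5, 3, 2, 4, 1], num_workers=2)
```

With these costs, worker 0 receives tasks 0 and 3 (total cost 9) and worker 1 receives tasks 1, 2, and 4 (total cost 6).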

Scaling

Adjusting compute resources based on demand. Eventual scales your jobs up during heavy processing and down during idle periods.

Types: Horizontal (more machines), Vertical (more powerful machines)

Storage

Data persistence layer for inputs, outputs, and intermediate results. Commonly uses cloud storage like S3.

Access patterns: Sequential reads, random access, batch writes

Status and States

Completed

Job has finished successfully and produced results. Final state for successful job execution.

Next steps: Retrieve results, analyze output, run dependent jobs

Failed

Job encountered an error and could not complete successfully. Requires investigation and potentially rerunning.

Actions: Check logs, fix issues, retry job

Pending

Job is waiting to be scheduled for execution. May be waiting for resources or dependencies.

Reason: Resource availability, queue position, dependency requirements

Running

Job is currently executing on the platform. Active processing state with ongoing resource usage.

Monitoring: Check logs, monitor progress, track resource usage

Scheduled

Job has been accepted and is waiting for execution resources. Transition state between submission and running.

Duration: Depends on resource availability and job priority

Submitted

Job has been sent to the platform for execution. Initial state after job submission.

Next state: Usually transitions to “scheduled” or “pending”
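The states in this section can be read as a small state machine. The sketch below encodes one plausible set of transitions inferred from the definitions above; the exact transitions on the platform are an assumption, and the names TRANSITIONS, can_transition, and is_terminal are hypothetical.

```python
from typing import Dict, Set

# Assumed transitions, inferred from the glossary definitions.
TRANSITIONS: Dict[str, Set[str]] = {
    "submitted": {"pending", "scheduled"},
    "pending": {"scheduled"},
    "scheduled": {"running"},
    "running": {"completed", "failed"},
    "completed": set(),  # final state for a successful job
    "failed": set(),     # final state for an unsuccessful job
}


def can_transition(current: str, target: str) -> bool:
    """Return True if a job in state current may move to state target."""
    return target in TRANSITIONS[current]


def is_terminal(state: str) -> bool:
    """Return True if no further transitions leave state."""
    return not TRANSITIONS[state]
```

Code that polls a job would typically loop until is_terminal returns True for the reported state.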

Common Abbreviations

API

Application Programming Interface - Set of protocols and tools for building software applications.

AWS

Amazon Web Services - Cloud computing platform commonly used for storage and compute resources.

CLI

Command Line Interface - Text-based interface for interacting with software.

GPU

Graphics Processing Unit - Specialized processor for parallel computing, commonly used in ML tasks.

HTTP

Hypertext Transfer Protocol - Protocol for transferring data over the internet.

JSON

JavaScript Object Notation - Lightweight data interchange format.

ML

Machine Learning - Field of AI focused on building systems that learn from data.

S3

Simple Storage Service - Amazon’s object storage service, commonly used for data storage.

SDK

Software Development Kit - Collection of tools for building applications.

URL

Uniform Resource Locator - Web address for accessing resources.

Related Technologies

Docker

Containerization platform used by Eventual to package and run jobs in isolated environments.

Kubernetes

Container orchestration platform used by Eventual to manage and scale containerized jobs.

Pandas

Python data analysis library similar to daft but designed for single-machine processing.

PyTorch

Machine learning framework commonly used in Eventual jobs for model inference and training.

Ray

Distributed computing framework that is similar to Eventual but requires more distributed systems knowledge.

Spark

Big data processing engine that Eventual outperforms for multimodal workloads.

TensorFlow

Machine learning framework alternative to PyTorch, also supported in Eventual jobs.

This glossary is continuously updated as new features and concepts are added to the Eventual platform. If you encounter a term not defined here, please reach out to support@eventualcomputing.com.
