Core Concepts
Authentication
Browser-based login system used to access the Eventual platform. Use ev auth login to authenticate via your browser, eliminating the need to manage API keys manually.
Example: ev auth login (opens a browser for secure authentication)
Client
Python class that provides programmatic access to the Eventual platform. Used to submit jobs, check status, and retrieve results.
Example:
client = Client.default()
daft
Our query engine that powers Eventual’s data processing capabilities. Designed for multimodal data including images, videos, audio, and text.
Related: Built by the same team that created Eventual
Environment (Env)
Runtime configuration that defines the dependencies, environment variables, and files needed for a job to run successfully.
Example:
env = Env().pip_install(["torch", "pillow"])
Job
Fundamental unit of execution on the Eventual platform. A Python function decorated with @job.main that runs distributed processing tasks.
Example:
@job.main
def process_data(input_path: str):
    ...
Job Handle
Reference to a running or completed job that allows you to check status, retrieve results, and monitor progress.
Example:
job_handle = client.run(job, args={"param": "value"})
Multimodal Data
Data that includes multiple types such as images, videos, audio, text, and structured data. Eventual specializes in processing these diverse data types together.
Examples: Product catalogs with images and descriptions, video content with transcripts
Resource
Reference-able entity that can be used across jobs and shared within your organization. Includes data volumes, ML models, and external services.
Example:
model = Resource(name="classifier", path="s3://models/model.pkl")
Space
Logical grouping of resources for organization, access control, and isolation. Similar to projects or workspaces in other platforms.
Examples: “production”, “development”, “research”
Volume
Resource abstraction for data storage locations such as S3 buckets, databases, or file systems.
Example:
volume = Volume(name="data", path="s3://bucket/data/")
Platform Features
Auto-scaling
Automatic adjustment of compute resources based on workload demands. Jobs scale up during heavy processing and scale down when idle.
Benefit: No manual resource management required
Fault Tolerance
Built-in resilience against hardware failures, network issues, and other problems. Jobs automatically retry failed operations and handle edge cases.
Features: Automatic retries, checkpoint recovery, graceful degradation
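To make "automatic retries" concrete, here is a minimal sketch of retrying a flaky operation with exponential backoff. It uses only the Python standard library; the retry helper and the flaky_download function are hypothetical names for illustration and say nothing about how Eventual implements retries internally.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], max_attempts: int = 3, base_delay: float = 0.01) -> T:
    """Call `operation` until it succeeds or `max_attempts` attempts have failed."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


# Hypothetical operation that fails twice before succeeding.
calls = {"count": 0}


def flaky_download() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient network issue")
    return "ok"


result = retry(flaky_download)
```

Here the third attempt succeeds, so result holds the return value and calls["count"] records three attempts. A real system would usually retry only errors known to be transient rather than every exception.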
Distributed Processing
Automatic distribution of work across multiple machines or cores. Your Python code runs in parallel without requiring distributed systems knowledge.
Benefit: Scale from prototype to production without code changes
Job Lifecycle
Complete process from job submission to completion: Submit → Schedule → Execute → Monitor → Complete.
Visibility: Full tracking and logging at every stage
Monitoring
Real-time tracking of job progress, resource usage, and performance metrics. Includes logging, metrics, and alerting capabilities.
Tools: CLI commands, dashboard UI, programmatic access
Technical Terms
Batch Processing
Processing multiple items together in a single operation for efficiency. Common in ML inference and data transformation tasks.
Example: Processing 32 images simultaneously instead of one at a time
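The 32-image example can be sketched in plain Python as a helper that splits a sequence into fixed-size batches. The batched function and the file names are illustrative assumptions, not part of the Eventual SDK.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# 70 hypothetical image paths split into batches of 32.
image_paths = [f"image_{i}.jpg" for i in range(70)]
batch_sizes = [len(batch) for batch in batched(image_paths, 32)]
# batch_sizes is [32, 32, 6]: the last batch holds the remainder
```

Each batch would then be passed to the model in one call, which is where the efficiency gain comes from.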
CLI
Command Line Interface - The ev command-line tool for interacting with the Eventual platform from your terminal.
Usage: ev run ./job.py, ev jobs list
Decorator
Python syntax using the @ symbol to modify function behavior. The @job.main decorator turns a regular function into an Eventual job.
Example:
@job.main
def my_function():
    ...
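For readers new to decorators, the sketch below shows the general mechanism with a hypothetical register decorator that records a function in a registry and counts calls. It illustrates the Python feature only; it is not how @job.main is implemented.

```python
import functools
from typing import Any, Callable, Dict

# Hypothetical registry that maps function names to wrapped functions.
REGISTRY: Dict[str, Callable[..., Any]] = {}


def register(func: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap `func` so that calls are counted, and record it in REGISTRY."""

    @functools.wraps(func)  # keeps the original name and docstring on the wrapper
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        wrapper.call_count += 1
        return func(*args, **kwargs)

    wrapper.call_count = 0
    REGISTRY[func.__name__] = wrapper
    return wrapper


@register  # equivalent to: add = register(add)
def add(a: int, b: int) -> int:
    return a + b


total = add(2, 3)
```

After this runs, total is 5, add.call_count is 1, and REGISTRY contains the wrapped function under the key "add".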
Dependencies
External packages required by your job, specified in the Environment configuration. Automatically installed before job execution.
Example:
env.pip_install(["numpy", "pandas", "torch"])
Embeddings
Numerical representations of data (text, images, audio) that capture semantic meaning. Used for similarity search and ML tasks.
Use cases: Similarity search, recommendation systems, clustering
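Similarity search over embeddings typically compares vectors with a measure such as cosine similarity. The sketch below computes it with the standard library; the three-dimensional vectors are made-up toy values, whereas real embeddings usually have hundreds of dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy embeddings (assumed values for illustration only).
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.1, 0.9]

cat_vs_kitten = cosine_similarity(cat, kitten)
cat_vs_car = cosine_similarity(cat, car)
```

With these values cat_vs_kitten is much closer to 1 than cat_vs_car, which is the property a similarity search relies on.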
Feature Extraction
Process of extracting meaningful characteristics from raw data (images, text, audio) using ML models.
Example: Extracting visual features from images using ResNet
Parquet
Columnar storage format optimized for analytical workloads. Commonly used with daft for efficient data processing.
Benefits: Compression, fast queries, schema evolution
SDK
Software Development Kit - The ev-sdk Python package that provides programmatic access to the Eventual platform.
Installation: pip install ev-sdk
Type Hints
Python annotations that specify parameter and return types. Required for Eventual job functions to ensure proper serialization.
Example:
from typing import Any, Dict
def process(data: str, count: int) -> Dict[str, Any]: ...
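One way a framework could make use of such annotations is to read them at runtime. The sketch below does this with the standard-library typing.get_type_hints. The check_args helper is hypothetical and does not reflect how Eventual itself validates or serializes arguments.

```python
from typing import Any, Callable, Dict, get_type_hints


def process(data: str, count: int) -> Dict[str, Any]:
    return {"data": data, "count": count}


# Maps each parameter name (and "return") to its annotated type.
hints = get_type_hints(process)


def check_args(func: Callable[..., Any], **kwargs: Any) -> None:
    """Raise TypeError if a keyword argument does not match its annotation."""
    func_hints = get_type_hints(func)
    for name, value in kwargs.items():
        expected = func_hints.get(name)
        # Only plain classes such as str or int are checked; generics are skipped.
        if isinstance(expected, type) and not isinstance(value, expected):
            raise TypeError(
                f"{name} should be {expected.__name__}, got {type(value).__name__}"
            )


check_args(process, data="a.txt", count=2)  # passes silently
```

Calling check_args(process, data="a.txt", count="2") would raise TypeError because count is annotated as int.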
Data Processing Terms
DataFrame
Tabular data structure used by daft for representing and processing datasets. Similar to pandas DataFrames but designed for distributed processing.
Operations: Filter, group, join, transform, aggregate
ETL
Extract, Transform, Load - Pattern for data processing pipelines. Eventual jobs commonly implement ETL workflows for multimodal data.
Stages: Extract from sources, Transform with ML models, Load to destinations
Pipeline
Series of connected processing steps where the output of one step becomes the input of the next. Common pattern in data processing and ML workflows.
Example: Data ingestion → Feature extraction → Model inference → Result storage
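The four-step example can be sketched as a list of functions applied in order, each receiving the previous step's output. All step functions below are hypothetical stand-ins: the "features" and "inference" are trivial string operations chosen so the sketch runs without any ML library.

```python
from functools import reduce
from typing import Any, Callable, Dict, List


def run_pipeline(steps: List[Callable[[Any], Any]], initial: Any) -> Any:
    """Feed `initial` through each step; every output becomes the next input."""
    return reduce(lambda value, step: step(value), steps, initial)


def ingest(paths: List[str]) -> List[str]:
    # Data ingestion: drop empty entries and trim whitespace.
    return [p.strip() for p in paths if p.strip()]


def extract_features(paths: List[str]) -> List[Dict[str, Any]]:
    # Feature extraction: a toy feature, the length of the path.
    return [{"path": p, "length": len(p)} for p in paths]


def infer(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    # Model inference: a toy rule standing in for a real model.
    return [{**r, "label": "long" if r["length"] > 5 else "short"} for r in records]


def store(records: List[Dict[str, Any]]) -> Dict[str, str]:
    # Result storage: collect labels in a dictionary keyed by path.
    return {r["path"]: r["label"] for r in records}


results = run_pipeline(
    [ingest, extract_features, infer, store],
    [" a.jpg ", "", "photo.png"],
)
```

Because each step only depends on its input, steps can be tested in isolation and swapped without touching the rest of the pipeline.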
Query Engine
System for executing queries against datasets. daft is the query engine that powers Eventual’s data processing capabilities.
Features: SQL-like operations, distributed execution, multimodal data support
Serialization
Process of converting data structures into a format that can be stored or transmitted. Eventual automatically handles serialization of job parameters and results.
Formats: JSON, pickle, parquet
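As a small illustration of two of the listed formats, the sketch below round-trips a parameter dictionary through JSON and pickle using the standard library. The parameter names are made up, and the example does not show which format Eventual chooses in any given case.

```python
import json
import pickle

# Hypothetical job parameters, including a tuple to show a format difference.
params = {"input_path": "s3://bucket/data/", "shape": (224, 224)}

# JSON: human-readable text, limited to basic types (tuples become lists).
json_text = json.dumps(params)
from_json = json.loads(json_text)

# pickle: Python-specific bytes that preserve most Python types.
pickle_bytes = pickle.dumps(params)
from_pickle = pickle.loads(pickle_bytes)
```

After the round trip, from_json["shape"] is the list [224, 224] while from_pickle["shape"] is still a tuple. Note that pickle data should only be loaded from trusted sources, since unpickling can execute arbitrary code.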
Schema
Structure definition for data, including column names, types, and constraints. Important for data validation and processing optimization.
Example:
{"name": "string", "age": "integer", "image_path": "string"}
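A schema like the one above can drive simple record validation. The sketch below is a minimal, assumed implementation: the validate helper and the mapping from type names to Python classes are hypothetical and cover only the two type names used in the example.

```python
from typing import Any, Dict, List

SCHEMA = {"name": "string", "age": "integer", "image_path": "string"}

# Assumed mapping from schema type names to Python classes.
TYPE_MAP = {"string": str, "integer": int}


def validate(record: Dict[str, Any], schema: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the record matches."""
    errors = []
    for column, type_name in schema.items():
        if column not in record:
            errors.append(f"missing column: {column}")
        elif not isinstance(record[column], TYPE_MAP[type_name]):
            errors.append(f"{column}: expected {type_name}")
    return errors


good = validate({"name": "Ada", "age": 36, "image_path": "ada.jpg"}, SCHEMA)
bad = validate({"name": "Ada", "age": "36"}, SCHEMA)
```

Here good is an empty list, while bad reports that age has the wrong type and image_path is missing.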
Machine Learning Terms
Inference
Process of using a trained model to make predictions on new data. Common use case for Eventual jobs processing large datasets.
Example: Classifying millions of images with a pre-trained model
Model
Trained machine learning algorithm that can make predictions or extract features from data. Often stored as a Resource in Eventual.
Types: Classification, regression, feature extraction, generative
Pre-trained Model
Model that has already been trained on a large dataset and can be used for inference or fine-tuning. Commonly used in Eventual jobs.
Examples: ResNet for images, BERT for text, Whisper for audio
Tensor
Multi-dimensional array used to represent data in ML frameworks like PyTorch and TensorFlow. Common data structure in Eventual ML jobs.
Dimensions: 1D (vector), 2D (matrix), 3D (image), 4D (batch of images)
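The dimension list can be illustrated without any ML framework by treating nested Python lists as tensors. The shape helper below is a hypothetical stand-in for the shape attribute that PyTorch and TensorFlow tensors expose, and it assumes rectangular (non-ragged) nesting.

```python
from typing import Any, Tuple


def shape(data: Any) -> Tuple[int, ...]:
    """Return the size of each nesting level of a rectangular nested list."""
    dims = []
    while isinstance(data, list):
        dims.append(len(data))
        data = data[0] if data else None  # descend into the first element
    return tuple(dims)


vector = [1.0, 2.0, 3.0]                                  # 1D
matrix = [[1, 2, 3], [4, 5, 6]]                           # 2D: 2 rows, 3 columns
image = [[[0, 0, 0] for _ in range(4)] for _ in range(2)]  # 3D: height 2, width 4, 3 channels
batch = [image, image]                                    # 4D: 2 images

shapes = [shape(vector), shape(matrix), shape(image), shape(batch)]
```

The layout used here (height, width, channels) is one common convention; some frameworks default to channels first.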
Training
Process of teaching a model to make predictions by showing it examples. While Eventual can run training jobs, it’s more commonly used for inference.
Output: Trained model weights and parameters
Validation
Process of evaluating model performance on held-out data that was not used for training. Important step before deploying models in production Eventual jobs.
Metrics: Accuracy, precision, recall, F1-score
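The four metrics can be computed directly from true and predicted labels. The sketch below handles the binary case with the standard library; the label lists are made-up values for illustration.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels: 3 true positives, 1 false positive, 1 false negative, 3 true negatives.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

For these labels all four metrics come out to 0.75, which is a coincidence of the toy data rather than a general property.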
Infrastructure Terms
Cluster
Group of connected computers that work together to process jobs. Eventual automatically manages clusters for you.
Benefits: Parallel processing, fault tolerance, resource pooling
Compute
Processing resources (CPU, GPU, memory) used to run jobs. Eventual automatically allocates appropriate compute based on job requirements.
Types: CPU instances, GPU instances, memory-optimized instances
Container
Lightweight, portable execution environment that packages your code and dependencies. Eventual uses containers to run jobs consistently.
Benefits: Isolation, reproducibility, portability
Load Balancing
Distribution of work across multiple machines to optimize performance and resource usage. Handled automatically by Eventual.
Result: Efficient resource utilization, improved performance
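One simple load-balancing strategy is to give each new task to the worker with the least accumulated work. The sketch below implements that greedy rule with a heap; the assign_tasks function and the task costs are illustrative assumptions and do not describe Eventual's scheduler.

```python
import heapq
from typing import Dict, List, Sequence


def assign_tasks(task_costs: Sequence[int], num_workers: int) -> Dict[int, List[int]]:
    """Greedily assign each task (by index) to the currently least-loaded worker."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    # Heap of (current load, worker id); ties go to the lower worker id.
    heap = [(0, worker_id) for worker_id in range(num_workers)]
    heapq.heapify(heap)
    assignment: Dict[int, List[int]] = {w: [] for w in range(num_workers)}
    for task_id, cost in enumerate(task_costs):
        load, worker_id = heapq.heappop(heap)
        assignment[worker_id].append(task_id)
        heapq.heappush(heap, (load + cost, worker_id))
    return assignment


# Four tasks with assumed costs 5, 3, 2, 4 spread over two workers.
assignment = assign_tasks([5, 3, 2, 4], num_workers=2)
```

With these costs worker 0 receives tasks 0 and 3 and worker 1 receives tasks 1 and 2. The greedy rule is fast but not always optimal; it is shown here only to make the idea tangible.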
Scaling
Adjusting compute resources based on demand. Eventual scales your jobs up during heavy processing and down during idle periods.
Types: Horizontal (more machines), Vertical (more powerful machines)
Storage
Data persistence layer for inputs, outputs, and intermediate results. Commonly uses cloud storage like S3.
Access patterns: Sequential reads, random access, batch writes
Status and States
Completed
Job has finished successfully and produced results. Final state for successful job execution.
Next steps: Retrieve results, analyze output, run dependent jobs
Failed
Job encountered an error and could not complete successfully. Requires investigation and potentially rerunning.
Actions: Check logs, fix issues, retry job
Pending
Job is waiting to be scheduled for execution. May be waiting for resources or dependencies.
Reason: Resource availability, queue position, dependency requirements
Running
Job is currently executing on the platform. Active processing state with ongoing resource usage.
Monitoring: Check logs, monitor progress, track resource usage
Scheduled
Job has been accepted and is waiting for execution resources. Transition state between submission and running.
Duration: Depends on resource availability and job priority
Submitted
Job has been sent to the platform for execution. Initial state after job submission.
Next state: Usually transitions to “scheduled” or “pending”
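The states in this section can be pictured as a small state machine. The sketch below is one possible model: the JobState enum and, in particular, the transition table are assumptions pieced together from the descriptions above, not an official specification of the platform.

```python
from enum import Enum
from typing import Dict, Set


class JobState(Enum):
    SUBMITTED = "submitted"
    PENDING = "pending"
    SCHEDULED = "scheduled"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transitions, inferred from the state descriptions in this glossary.
ALLOWED_TRANSITIONS: Dict[JobState, Set[JobState]] = {
    JobState.SUBMITTED: {JobState.PENDING, JobState.SCHEDULED},
    JobState.PENDING: {JobState.SCHEDULED},
    JobState.SCHEDULED: {JobState.RUNNING},
    JobState.RUNNING: {JobState.COMPLETED, JobState.FAILED},
    JobState.COMPLETED: set(),
    JobState.FAILED: set(),
}


def can_transition(current: JobState, target: JobState) -> bool:
    """Return True if moving from `current` to `target` is allowed by the table."""
    return target in ALLOWED_TRANSITIONS[current]


def is_terminal(state: JobState) -> bool:
    """A state is terminal when no further transitions are allowed."""
    return not ALLOWED_TRANSITIONS[state]
```

Under this model, code that polls a job would keep checking until is_terminal returns True, then branch on whether the final state is completed or failed.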
Common Abbreviations
API
Application Programming Interface - Set of protocols and tools for building software applications.
AWS
Amazon Web Services - Cloud computing platform commonly used for storage and compute resources.
CLI
Command Line Interface - Text-based interface for interacting with software.
GPU
Graphics Processing Unit - Specialized processor for parallel computing, commonly used in ML tasks.
HTTP
Hypertext Transfer Protocol - Protocol for transferring data over the internet.
JSON
JavaScript Object Notation - Lightweight data interchange format.
ML
Machine Learning - Field of AI focused on building systems that learn from data.
S3
Simple Storage Service - Amazon’s object storage service, commonly used for data storage.
SDK
Software Development Kit - Collection of tools for building applications.
URL
Uniform Resource Locator - Web address for accessing resources.
Related Technologies
Docker
Containerization platform used by Eventual to package and run jobs in isolated environments.
Kubernetes
Container orchestration platform used by Eventual to manage and scale containerized jobs.
Pandas
Python data analysis library similar to daft but designed for single-machine processing.
PyTorch
Machine learning framework commonly used in Eventual jobs for model inference and training.
Ray
Distributed computing framework similar to Eventual but requires more distributed systems knowledge.
Spark
Big data processing engine that Eventual outperforms for multimodal workloads.
TensorFlow
Machine learning framework alternative to PyTorch, also supported in Eventual jobs.
This glossary is continuously updated as new features and concepts are added to the Eventual platform. If you encounter a term not defined here, please reach out to support@eventualcomputing.com.