torch_geometric.profile

`profileit`	A decorator to facilitate profiling a function, e.g., obtaining training runtime and memory statistics of a specific model on a specific dataset.
`timeit`	A context decorator to facilitate timing a function, e.g., obtaining the runtime of a specific model on a specific dataset.
`get_stats_summary`	Creates a summary of collected runtime and memory statistics.
`trace_handler`
`print_time_total`
`rename_profile_file`
`torch_profile`
`count_parameters`	Given a `torch.nn.Module`, count its trainable parameters.
`get_model_size`	Given a `torch.nn.Module`, get its actual disk size in bytes.
`get_data_size`	Given a `torch_geometric.data.Data` object, get its theoretical memory usage in bytes.
`get_cpu_memory_from_gc`	Returns the used CPU memory in bytes, as reported by the Python garbage collector.
`get_gpu_memory_from_gc`	Returns the used GPU memory in bytes, as reported by the Python garbage collector.
`get_gpu_memory_from_nvidia_smi`	Returns the free and used GPU memory in megabytes, as reported by `nvidia-smi`.
`benchmark`	Benchmark a list of functions `funcs` that receive the same set of arguments `args`.

profileit()

A decorator to facilitate profiling a function, e.g., obtaining training runtime and memory statistics of a specific model on a specific dataset. Returns a Stats object with the attributes time, max_allocated_cuda, max_reserved_cuda, max_active_cuda, nvidia_smi_free_cuda, nvidia_smi_used_cuda.

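from torch_geometric.profile import profileit

@profileit()
def train(model, optimizer, x, edge_index, y):
    optimizer.zero_grad()
    out = model(x, edge_index)
    loss = criterion(out, y)
    loss.backward()
    optimizer.step()
    return float(loss)

# The decorated function returns its own result together with the statistics.
loss, stats = train(model, optimizer, x, edge_index, y)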

class timeit(log: bool = True, avg_time_divisor: int = 0)

A context decorator to facilitate timing a function, e.g., obtaining the runtime of a specific model on a specific dataset.

import torch
from torch_geometric.profile import timeit

@torch.no_grad()
def test(model, x, edge_index):
    return model(x, edge_index)

with timeit() as t:
    z = test(model, x, edge_index)
time = t.duration  # elapsed time in seconds

Parameters

log (bool, optional) – If set to False, will not log any runtime to the console. (default: True)
avg_time_divisor (int, optional) – If set to a value greater than 1, will divide the total time by this value. Useful for calculating the average of runtimes within a for-loop. (default: 0)

reset(): Prints the duration and resets the current timer.
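The behaviour described above (a context decorator with an optional averaging divisor and a reset method) can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: `SimpleTimer` and `work` are hypothetical names, and the use of `time.perf_counter` is an assumption.

```python
import time
from contextlib import ContextDecorator


class SimpleTimer(ContextDecorator):
    """Hypothetical stand-in for a context-decorator timer (not PyG's code)."""

    def __init__(self, log=True, avg_time_divisor=0):
        self.log = log
        self.avg_time_divisor = avg_time_divisor
        self.duration = None
        self._start = None

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, *exc):
        elapsed = time.perf_counter() - self._start
        # Only divisors greater than 1 change the reported value.
        if self.avg_time_divisor > 1:
            elapsed /= self.avg_time_divisor
        self.duration = elapsed
        if self.log:
            print(f"Time: {self.duration:.8f}s")
        return False  # never swallow exceptions

    def reset(self):
        # Report the time measured so far, then start a fresh measurement.
        if self._start is None:
            raise RuntimeError("Timer has not been started")
        self.__exit__()
        self.__enter__()


def work():
    return sum(i * i for i in range(10_000))


# Average the runtime over 5 iterations of a loop.
with SimpleTimer(log=False, avg_time_divisor=5) as t:
    for _ in range(5):
        work()
avg_duration = t.duration


# ContextDecorator also allows the same class to wrap a function.
@SimpleTimer(log=False)
def timed_work():
    return work()


result = timed_work()
```

Setting `avg_time_divisor` to the number of loop iterations turns the total elapsed time into a per-iteration average, which is the use case the parameter description mentions.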

get_stats_summary(stats_list: List[Stats])

Creates a summary of collected runtime and memory statistics. Returns a StatsSummary object with the attributes time_mean, time_std, max_allocated_cuda, max_reserved_cuda, max_active_cuda, min_nvidia_smi_free_cuda, max_nvidia_smi_used_cuda.

Parameters: stats_list (List[Stats]) – A list of Stats objects, as returned by profileit().
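The attribute names suggest how per-run statistics are aggregated: mean and standard deviation for the runtime, the maximum for peak memory values, and the minimum for free memory. The following is a sketch under that assumption, using the standard library only. `RunStats`, `RunSummary` and `summarize` are hypothetical names, only a subset of the fields is modelled, the numbers are made up, and using the sample standard deviation is an assumption.

```python
from statistics import mean, stdev
from typing import List, NamedTuple


class RunStats(NamedTuple):
    # Hypothetical per-run record with a subset of the documented fields.
    time: float
    max_reserved_cuda: float
    max_active_cuda: float
    nvidia_smi_free_cuda: float
    nvidia_smi_used_cuda: float


class RunSummary(NamedTuple):
    time_mean: float
    time_std: float
    max_reserved_cuda: float
    max_active_cuda: float
    min_nvidia_smi_free_cuda: float
    max_nvidia_smi_used_cuda: float


def summarize(stats_list: List[RunStats]) -> RunSummary:
    times = [s.time for s in stats_list]
    return RunSummary(
        time_mean=mean(times),
        # Sample standard deviation needs at least two runs.
        time_std=stdev(times) if len(times) > 1 else 0.0,
        max_reserved_cuda=max(s.max_reserved_cuda for s in stats_list),
        max_active_cuda=max(s.max_active_cuda for s in stats_list),
        min_nvidia_smi_free_cuda=min(s.nvidia_smi_free_cuda for s in stats_list),
        max_nvidia_smi_used_cuda=max(s.nvidia_smi_used_cuda for s in stats_list),
    )


# Made-up numbers for two runs, purely for illustration.
runs = [
    RunStats(1.0, 120.0, 100.0, 900.0, 150.0),
    RunStats(3.0, 130.0, 90.0, 880.0, 170.0),
]
summary = summarize(runs)
```

In practice the list would be built by collecting the second return value of a profiled function over several epochs.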

trace_handler(p)

print_time_total(p)

rename_profile_file(*args)

torch_profile(export_chrome_trace=True, csv_data=None, write_csv=None)

count_parameters(model: Module) → int

Given a torch.nn.Module, count its trainable parameters.

Parameters: model (torch.nn.Module) – The model.
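As a sanity check on a reported count, the expected number of trainable parameters of a plain multi-layer perceptron can be worked out by hand. The sketch below does that arithmetic in pure Python. `dense_layer_params` and `mlp_params` are hypothetical helpers, and the layer sizes are arbitrary.

```python
from typing import Sequence


def dense_layer_params(in_features: int, out_features: int, bias: bool = True) -> int:
    """Weights plus optional biases of one fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


def mlp_params(sizes: Sequence[int], bias: bool = True) -> int:
    """Total parameters of an MLP whose consecutive layer widths are `sizes`."""
    return sum(dense_layer_params(a, b, bias) for a, b in zip(sizes, sizes[1:]))


# 16 -> 32 -> 7: (16*32 + 32) + (32*7 + 7) = 544 + 231
total = mlp_params([16, 32, 7])
```

Frozen parameters (those that do not require gradients) would not be part of a trainable-parameter count, so they should be left out of such an estimate.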

get_model_size(model: Module) → int

Given a torch.nn.Module, get its actual disk size in bytes.

Parameters: model (torch.nn.Module) – The model.

get_data_size(data: BaseData) → int

Given a torch_geometric.data.Data object, get its theoretical memory usage in bytes.

Parameters: data (torch_geometric.data.Data or torch_geometric.data.HeteroData) – The Data or HeteroData graph object.
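A theoretical memory figure of this kind is typically the number of elements of each tensor multiplied by the byte width of its dtype, summed over all tensors. The pure-Python sketch below illustrates that arithmetic. `tensor_bytes` and `graph_bytes` are hypothetical helpers, the shapes are arbitrary, and the sketch ignores any tensors that share storage.

```python
from math import prod
from typing import Dict, Tuple

# Byte widths of a few common dtypes.
BYTES_PER_ELEMENT = {"float32": 4, "float64": 8, "int32": 4, "int64": 8, "bool": 1}


def tensor_bytes(shape: Tuple[int, ...], dtype: str) -> int:
    """Number of elements times the size of one element."""
    return prod(shape) * BYTES_PER_ELEMENT[dtype]


def graph_bytes(tensors: Dict[str, Tuple[Tuple[int, ...], str]]) -> int:
    """Sum the theoretical sizes of all tensors describing one graph."""
    return sum(tensor_bytes(shape, dtype) for shape, dtype in tensors.values())


# An illustrative graph: 100 nodes with 16 features each, and 500 edges.
graph = {
    "x": ((100, 16), "float32"),      # 100 * 16 * 4 = 6400 bytes
    "edge_index": ((2, 500), "int64"),  # 2 * 500 * 8 = 8000 bytes
    "y": ((100,), "int64"),           # 100 * 8 = 800 bytes
}
size = graph_bytes(graph)
```

The result is a lower bound on real usage, since allocator overhead and Python object headers are not included.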

get_cpu_memory_from_gc() → int: Returns the used CPU memory in bytes, as reported by the Python garbage collector.

get_gpu_memory_from_gc(device: int = 0) → int

Returns the used GPU memory in bytes, as reported by the Python garbage collector.

Parameters: device (int, optional) – The GPU device identifier. (default: 0)

get_gpu_memory_from_nvidia_smi(device: int = 0, digits: int = 2) → Tuple[float, float]

Returns the free and used GPU memory in megabytes, as reported by nvidia-smi.

Note

nvidia-smi will generally overestimate the amount of memory used by the actual program.

Parameters

device (int, optional) – The GPU device identifier. (default: 0)
digits (int) – The number of decimals to use for megabytes. (default: 2)

benchmark(funcs: List[Callable], args: Union[Tuple[Any], List[Tuple[Any]]], num_steps: int, func_names: Optional[List[str]] = None, num_warmups: int = 10, backward: bool = False)

Benchmark a list of functions funcs that receive the same set of arguments args.

Parameters

funcs ([Callable]) – The list of functions to benchmark.
args ((Any, ) or [(Any, )]) – The arguments to pass to the functions. Can be a list with one argument tuple per function in funcs in case their signatures differ.
num_steps (int) – The number of steps to run the benchmark.
func_names ([str], optional) – The names of the functions. If not given, will try to infer the name from the function itself. (default: None)
num_warmups (int, optional) – The number of warmup steps. (default: 10)
backward (bool, optional) – If set to True, will benchmark both forward and backward passes. (default: False)
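The parameters above describe a common benchmarking pattern: run a few untimed warmup calls, then time a fixed number of steps for each function. The sketch below shows that pattern with the standard library only. `simple_benchmark` is a hypothetical helper, not the library function. It omits the backward pass and returns average times in a dict, which is an assumption about how the results might be presented.

```python
import time
from typing import Any, Callable, Dict, List, Optional, Tuple, Union


def simple_benchmark(
    funcs: List[Callable],
    args: Union[Tuple[Any, ...], List[Tuple[Any, ...]]],
    num_steps: int,
    func_names: Optional[List[str]] = None,
    num_warmups: int = 10,
) -> Dict[str, float]:
    # A single tuple is shared by all functions; a list gives one tuple each.
    per_func_args = args if isinstance(args, list) else [args] * len(funcs)
    if len(per_func_args) != len(funcs):
        raise ValueError("Need exactly one argument tuple per function")

    # Fall back to the function's own name if none is given.
    names = func_names or [getattr(f, "__name__", repr(f)) for f in funcs]

    results = {}
    for name, func, fargs in zip(names, funcs, per_func_args):
        for _ in range(num_warmups):  # untimed warmup calls
            func(*fargs)
        start = time.perf_counter()
        for _ in range(num_steps):
            func(*fargs)
        results[name] = (time.perf_counter() - start) / num_steps
    return results


def sum_builtin(values):
    return sum(values)


def sum_loop(values):
    total = 0
    for v in values:
        total += v
    return total


# Both functions receive the same single argument tuple.
timings = simple_benchmark([sum_builtin, sum_loop], (list(range(1000)),), num_steps=50)
```

Warmup steps matter most when the first calls include one-off costs such as caching or lazy initialization, which would otherwise skew the average.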