Profiling Data Management
When performing complex profiling, developers often find themselves lost in a maze of repetitive commands and scattered files. You run go test -bench=BenchmarkMyFunc -cpuprofile=cpu.out, then go tool pprof -top cpu.out > results.txt, inspect a function with go tool pprof -list=MyFunc cpu.out, make modifications, and run the benchmark again. Hours later, you're exhausted, have dozens of inconsistently named files scattered across directories, and can't remember which changes led to which results. Without systematic organization, you lose track of your optimization journey, lack accurate "before and after" snapshots to share with your team, and waste valuable time context-switching between profiling commands instead of focusing on actual performance improvements. Prof eliminates this chaos by capturing everything in one command and automatically organizing all profiling data (binary files, text reports, function-level analysis, and visualizations) into a structured, tagged hierarchy that preserves your optimization history and makes collaboration effortless.
Auto
The auto command wraps go test and pprof to run benchmarks, collect all profile types, and organize everything automatically:
prof auto --benchmarks "BenchmarkGenPool" --profiles "cpu,memory,mutex,block" --count 10 --tag "baseline"
This single command replaces dozens of manual steps and creates a complete, organized profiling dataset ready for analysis or comparison.
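For context, prof auto only needs an ordinary Go benchmark to target. A minimal, hypothetical example (the pool package name and benchmark body are illustrative, not part of Prof):

package pool

import (
	"sync"
	"testing"
)

// BenchmarkGenPool exercises a sync.Pool under parallelism so the cpu,
// memory, mutex, and block profiles all have activity to record.
func BenchmarkGenPool(b *testing.B) {
	var p sync.Pool
	p.New = func() any { return make([]byte, 1024) }
	b.ReportAllocs()
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			buf := p.Get().([]byte)
			p.Put(buf)
		}
	})
}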
Output Structure:
bench/baseline/
├── description.txt                      # User documentation for this run
├── bin/BenchmarkGenPool/                # Binary profile files
│   ├── BenchmarkGenPool_cpu.out
│   ├── BenchmarkGenPool_memory.out
│   ├── BenchmarkGenPool_mutex.out
│   └── BenchmarkGenPool_block.out
├── text/BenchmarkGenPool/               # Text reports & benchmark output
│   ├── BenchmarkGenPool_cpu.txt
│   ├── BenchmarkGenPool_memory.txt
│   └── BenchmarkGenPool.txt
├── cpu_functions/BenchmarkGenPool/      # Function-level CPU profile data
│   ├── Put.txt
│   ├── Get.txt
│   └── getShard.txt
└── memory_functions/BenchmarkGenPool/   # Function-level memory profile data
    ├── Put.txt
    └── allocator.txt
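Each file under cpu_functions/ and memory_functions/ holds the function-level listing for a single function, roughly what you would otherwise produce by hand with go tool pprof -list=getShard cpu.out for each function of interest.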
Auto - Configuration
By default, prof collects all functions shown in the text report of a profile. To customize this behavior, run:
prof setup
This creates a configuration file with the following structure:
{
  "function_collection_filter": {
    "BenchmarkGenPool": {
      "include_prefixes": ["github.com/example/GenPool"],
      "ignore_functions": ["init", "TestMain", "BenchmarkMain"]
    }
  }
}
Configuration Options:
- BenchmarkGenPool: Replace with your benchmark function name, or with "*" to apply to all benchmarks.
- include_prefixes: Only collect functions whose names start with these prefixes.
- ignore_functions: Exclude specific functions from collection, even if they match the include prefixes.
This filtering helps focus profiling on relevant code paths while excluding test setup and initialization functions that may not be meaningful for performance analysis.
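Conceptually, the filter is a prefix match plus an exclusion list that takes precedence. A minimal sketch of the idea in Go (illustrative only; the exact matching rules, such as suffix matching for ignore_functions, are assumptions, not Prof's actual implementation):

package main

import (
	"fmt"
	"strings"
)

// keepFunction sketches the include/ignore decision: ignored names win,
// and an empty prefix list keeps everything in the text report.
func keepFunction(name string, includePrefixes, ignoreFunctions []string) bool {
	for _, ig := range ignoreFunctions {
		if strings.HasSuffix(name, ig) { // assumption: match on the trailing name
			return false
		}
	}
	if len(includePrefixes) == 0 {
		return true
	}
	for _, p := range includePrefixes {
		if strings.HasPrefix(name, p) {
			return true
		}
	}
	return false
}

func main() {
	prefixes := []string{"github.com/example/GenPool"}
	ignored := []string{"init", "TestMain", "BenchmarkMain"}
	fmt.Println(keepFunction("github.com/example/GenPool.Put", prefixes, ignored))  // true
	fmt.Println(keepFunction("github.com/example/GenPool.init", prefixes, ignored)) // false
}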
Manual
The manual command processes existing profile files without running benchmarks; it only uses pprof to organize data you already have:
prof manual --tag "external-profiles" BenchmarkGenPool_cpu.out memory.out block.out
This organizes your existing profile files into a flatter structure based on the profile filename:
Manual Output Structure:
bench/external-profiles/
├── BenchmarkGenPool_cpu/
│   ├── BenchmarkGenPool_cpu.txt   # Text report
│   └── functions/                 # Function-level profile data
│       ├── Put.txt
│       ├── Get.txt
│       └── getShard.txt
├── memory/
│   ├── memory.txt                 # Text report
│   └── functions/                 # Function-level profile data
│       └── allocator.txt
└── block/
    ├── block.txt                  # Text report
    └── functions/                 # Function-level profile data
        └── runtime.txt
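The input files can come from anything that writes pprof-format profiles, for example the standard Go toolchain (the package path is illustrative):
go test -bench=BenchmarkGenPool -cpuprofile=BenchmarkGenPool_cpu.out \
    -memprofile=memory.out -blockprofile=block.out ./pool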
Manual - Configuration
Configuration works the same as for auto, except the keys are profile file base names (without extensions) instead of benchmark names:
{
  "function_collection_filter": {
    "BenchmarkGenPool_cpu": {
      "include_prefixes": ["github.com/example/GenPool"],
      "ignore_functions": ["init", "TestMain", "BenchmarkMain"]
    }
  }
}
For example, BenchmarkGenPool_cpu.out becomes BenchmarkGenPool_cpu in the configuration.
Performance Comparison
Prof's performance comparison automatically drills down from benchmark-level changes to show you exactly which functions changed. Instead of just reporting that performance improved or regressed, Prof pinpoints the specific functions responsible and shows you detailed before-and-after comparisons.
Track Auto
Use track auto when comparing data collected with prof auto. Simply reference the tag names:
prof track auto --base "baseline" --current "optimized" \
--profile-type "cpu" --bench-name "BenchmarkGenPool" \
--output-format "summary"
prof track auto --base "baseline" --current "optimized" \
--profile-type "cpu" --bench-name "BenchmarkGenPool" \
--output-format "detailed"
Track Manual
Use track manual when comparing external profile files by specifying their relative paths:
prof track manual --base path/to/base/report/cpu.txt \
--current path/to/current/report/cpu.txt \
--output-format "summary"
prof track manual --base path/to/base/report/cpu.txt \
--current path/to/current/report/cpu.txt \
--output-format "detailed"
Output Formats
Prof can render comparison results at two levels of detail (summary and detailed) and in several presentation formats. Currently supported formats:
- Terminal (default)
- HTML
- JSON
Summary Format
The summary format gives you a high-level overview of all performance changes, organized by impact:
==== Performance Tracking Summary ====
Total Functions Analyzed: 78
Regressions: 9
Improvements: 8
Stable: 61
⚠️ Top Regressions (worst first):
• internal/cache.getShard: +200.0% (0.030s → 0.090s)
• internal/hash.Spread: +180.0% (0.050s → 0.140s)
• pool/acquire: +150.0% (0.020s → 0.050s)
• encoding/json.Marshal: +125.0% (0.080s → 0.180s)
• sync.Pool.Get: +100.0% (0.010s → 0.020s)
✅ Top Improvements (best first):
• compress/gzip.NewWriter: -100.0% (0.020s → 0.000s)
• internal/metrics.resetCounters: -100.0% (0.010s → 0.000s)
• encoding/json.Unmarshal: -95.0% (0.100s → 0.005s)
• net/url.ParseQuery: -90.0% (0.050s → 0.005s)
• pool/isFull: -85.0% (0.020s → 0.003s)
Detailed Format
The detailed format provides comprehensive analysis for each changed function, including impact assessment and action recommendations:
📊 Summary: 78 total functions | 🔴 9 regressions | 🟢 8 improvements | ⚪ 61 stable
📋 Report Order: Regressions first (worst → best), then Improvements (best → worst), then Stable
════════════════ PERFORMANCE CHANGE REPORT ════════════════
Function: github.com/Random/Pool/pool.getShard
Analysis Time: 2025-07-23 15:51:59 PDT
Change Type: REGRESSION
⚠️ Performance regression detected
──────── FLAT TIME ANALYSIS ────────
Before: 0.030000s
After: 0.090000s
Delta: +0.060000s
Change: +200.00%
Impact: Function is 200.00% SLOWER
──────── CUMULATIVE TIME ANALYSIS ────────
Before: 0.030s
After: 0.100s
Delta: +0.070s
Change: +233.33%
──────── IMPACT ASSESSMENT ────────
Severity: CRITICAL
Recommendation: Critical regression! Immediate investigation required.
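The Change figures are plain relative deltas against the base run; a minimal sketch of the arithmetic (not Prof's code):

package main

import "fmt"

// percentChange returns the relative change of flat or cumulative time,
// e.g. (0.090 - 0.030) / 0.030 * 100 = +200%.
func percentChange(before, after float64) float64 {
	return (after - before) / before * 100
}

func main() {
	fmt.Printf("%+.2f%%\n", percentChange(0.030, 0.090)) // +200.00%
}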
HTML & JSON Output
In addition to terminal display, Prof can export both summary and detailed reports in:
- 📄 HTML: shareable and human-friendly
- 🧩 JSON: structured format for programmatic use or further integration
--output-format summary-html
--output-format detailed-json
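For example, combining the flags shown earlier to export a shareable HTML summary:
prof track auto --base "baseline" --current "optimized" \
    --profile-type "cpu" --bench-name "BenchmarkGenPool" \
    --output-format "summary-html"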