Code Profiling in Java

Edoardo Midali

Code profiling is a systematic process for analyzing an application's performance, identifying bottlenecks, and optimizing resource usage. Java offers numerous tools and techniques that give detailed insight into CPU usage, memory allocation, garbage collection, and I/O operations, providing the basis for targeted, effective optimizations.

Profiling Fundamentals

Profiling relies on collecting and analyzing runtime metrics that reveal how the application behaves under load. It is essential for finding performance bottlenecks that are not evident during development or small-scale testing.

Types of Profiling

CPU Profiling: Identifies the methods that consume the most execution time, hot spots in the code, and algorithmic inefficiencies.

Memory Profiling: Analyzes memory allocations, memory leaks, garbage collection overhead, and object lifecycles.

I/O Profiling: Monitors input/output operations, network latency, and file system access patterns.

Thread Profiling: Examines contention, deadlocks, thread lifecycles, and synchronization overhead.

import java.util.ArrayList;
import java.util.List;

public class ProfilingTargetApp {

    public static void main(String[] args) {
        ProfilingTargetApp app = new ProfilingTargetApp();

        // Simulate a workload for profiling
        for (int i = 0; i < 1000; i++) {
            app.processData();
            app.allocateMemory();
            app.performCalculations();

            if (i % 100 == 0) {
                System.out.println("Processed iteration: " + i);
            }
        }
    }

    // CPU-intensive method for CPU profiling
    private void performCalculations() {
        double result = 0;
        for (int i = 0; i < 100000; i++) {
            result += Math.sqrt(i) * Math.sin(i);
        }
    }

    // Memory-intensive method for memory profiling
    private void allocateMemory() {
        List<String> tempList = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            tempList.add("String " + i);
        }
        // tempList goes out of scope here and becomes eligible for GC
    }

    // Method with potential inefficiencies
    private void processData() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100; i++) {
            sb.append("Data ").append(i).append(" ");
        }
        String result = sb.toString();
    }
}

Profiling Methodologies

Sampling Profiling: Periodically samples the application's state (typically thread stack traces) with minimal overhead, at the cost of statistical rather than exact results.

Instrumentation Profiling: Inserts monitoring code to collect precise data, at the cost of higher overhead.

Hybrid Approaches: Combine sampling and instrumentation to balance precision and performance.
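To make the sampling approach concrete, the following is a minimal sketch of a stack sampler built only on `Thread.getStackTrace()`. The class `TinySampler` and its methods are hypothetical names introduced for illustration; this is not how JFR or any production profiler is implemented, and it assumes that attributing each sample to the innermost frame is good enough for a first look.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper for illustration; not part of any profiler's API
class TinySampler {

    // Sample counts keyed by "ClassName.methodName" of the innermost frame
    private final Map<String, LongAdder> hits = new ConcurrentHashMap<>();

    // Record one sample; only the top of the stack is attributed
    void record(StackTraceElement[] stack) {
        if (stack.length == 0) {
            return; // No Java frames were available for this sample
        }
        String key = stack[0].getClassName() + "." + stack[0].getMethodName();
        hits.computeIfAbsent(key, k -> new LongAdder()).increment();
    }

    long samplesFor(String key) {
        LongAdder count = hits.get(key);
        return count == null ? 0 : count.sum();
    }

    // Sample the target thread every intervalMillis until it terminates
    Thread attach(Thread target, long intervalMillis) {
        Thread sampler = new Thread(() -> {
            try {
                while (target.isAlive()) {
                    record(target.getStackTrace());
                    Thread.sleep(intervalMillis);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "tiny-sampler");
        sampler.setDaemon(true);
        sampler.start();
        return sampler;
    }

    // Print methods ordered by descending sample count
    void printReport() {
        hits.entrySet().stream()
            .sorted(Comparator.comparingLong(
                (Map.Entry<String, LongAdder> e) -> e.getValue().sum()).reversed())
            .forEach(e -> System.out.println(e.getValue().sum() + "  " + e.getKey()));
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            double sum = 0;
            for (int i = 0; i < 50_000_000; i++) {
                sum += Math.sqrt(i);
            }
            System.out.println("sum = " + sum);
        }, "worker");

        TinySampler sampler = new TinySampler();
        worker.start();
        Thread samplerThread = sampler.attach(worker, 10);
        worker.join();
        samplerThread.join();
        sampler.printReport();
    }
}
```

The sample counts are statistical: a method that appears in 80% of the samples was on top of the stack for roughly 80% of the sampled time. An instrumentation profiler would count every call exactly, but would have to modify the code being measured.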

Built-in JVM Tools

The JVM ships with several built-in monitoring and profiling tools that require no additional software.

JVM Flags for Profiling

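# CPU profiling with Flight Recorder
# (the separate -XX:+FlightRecorder flag is no longer needed on JDK 11+)
-XX:StartFlightRecording=duration=60s,filename=profile.jfr

# Detailed GC logging and pause times (JDK 9+ unified logging)
-Xlog:gc*:file=gc.log:time,uptime
-Xlog:safepoint

# JDK 8 equivalents of the two -Xlog lines above
-Xloggc:gc.log
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCApplicationConcurrentTime

# Compilation and class-loading information
# (-Xlog:class+load replaces -XX:+TraceClassLoading on JDK 9+;
#  UnlockDiagnosticVMOptions is only needed for diagnostic options such as -XX:+PrintInlining)
-XX:+PrintCompilation
-Xlog:class+load=info
-XX:+UnlockDiagnosticVMOptions

# Memory debugging (JDK 8 with G1 and -XX:+UseStringDeduplication)
-XX:+PrintStringDeduplicationStatistics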

JFR (Java Flight Recorder)

JFR is a low-overhead profiling framework built into the JVM that collects detailed performance events.

import java.util.Random;

import jdk.jfr.Event;
import jdk.jfr.Description;
import jdk.jfr.Label;

public class JFRProfilingDemo {

    // Custom event for JFR
    @Label("Database Operation")
    @Description("Tracks database operation performance")
    static class DatabaseEvent extends Event {
        @Label("Operation Type")
        String operationType;

        @Label("Execution Time")
        long executionTime;

        @Label("Records Processed")
        int recordsProcessed;
    }

    public void performDatabaseOperation(String operation, int records) {
        DatabaseEvent event = new DatabaseEvent();
        event.operationType = operation;
        event.recordsProcessed = records;

        long startTime = System.nanoTime();
        event.begin(); // Start timing the event

        try {
            // Simulate the database operation
            simulateDatabaseWork(records);
        } finally {
            event.executionTime = System.nanoTime() - startTime;
            event.commit(); // End the event and write it to JFR
        }
    }

    private void simulateDatabaseWork(int records) {
        try {
            Thread.sleep(records / 10); // Simulate latency proportional to the record count
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void demonstrateJFRProfiling() {
        JFRProfilingDemo demo = new JFRProfilingDemo();

        // Generate events for profiling
        String[] operations = {"SELECT", "INSERT", "UPDATE", "DELETE"};
        Random random = new Random();

        for (int i = 0; i < 100; i++) {
            String op = operations[random.nextInt(operations.length)];
            int records = random.nextInt(1000) + 1;
            demo.performDatabaseOperation(op, records);
        }
    }
}
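Custom events only become useful once they are captured and read back. The sketch below, under the assumption of a JDK 11+ runtime with the `jdk.jfr` module available, starts a recording from code, emits a few events, dumps them to a temporary file, and parses that file with `RecordingFile`. The class `JfrRoundTripSketch`, the event name `sketch.Work`, and the method `recordAndCount` are hypothetical names chosen for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

class JfrRoundTripSketch {

    // Illustrative event; the name "sketch.Work" is an assumption of this example
    @Name("sketch.Work")
    @Label("Work")
    static class WorkEvent extends Event {
        @Label("Operation Type")
        String operationType;
    }

    // Records eventCount events, then reads the recording back and counts them
    static long recordAndCount(int eventCount) {
        try {
            Path file = Files.createTempFile("sketch", ".jfr");
            try (Recording recording = new Recording()) {
                // Keep every event, even those with zero duration
                recording.enable(WorkEvent.class).withoutThreshold();
                recording.start();

                for (int i = 0; i < eventCount; i++) {
                    WorkEvent event = new WorkEvent();
                    event.operationType = "SELECT";
                    event.commit();
                }

                recording.stop();
                recording.dump(file); // Write the recorded data to disk
            }

            List<RecordedEvent> events = RecordingFile.readAllEvents(file);
            Files.delete(file);

            return events.stream()
                .filter(e -> e.getEventType().getName().equals("sketch.Work"))
                .count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Events read back: " + recordAndCount(5));
    }
}
```

The same `.jfr` file could instead be opened in JDK Mission Control or printed with the `jfr print` command-line tool; parsing it in code is mainly useful for automated checks.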

Command Line Tools

# jstat - GC statistics
jstat -gc -t <pid> 1s    # GC stats every second, with a timestamp column
jstat -gccapacity <pid>   # Generation capacity information
jstat -gcutil <pid> 5s    # Utilization percentages every 5 seconds

# jstack - Thread dump
jstack <pid>              # Stack traces of all threads
jstack -l <pid>           # Include lock information

# jmap - Memory mapping
jmap -histo <pid>         # Class histogram
jmap -dump:format=b,file=heap.hprof <pid>  # Heap dump

# jcmd - Multipurpose tool
jcmd <pid> VM.classloader_stats    # Classloader statistics
jcmd <pid> GC.run                  # Force GC
jcmd <pid> Thread.print            # Thread dump
jcmd <pid> JFR.start duration=60s filename=recording.jfr
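When attaching a command-line tool is not possible, a similar thread dump can be produced from inside the process through the standard `ThreadMXBean`. The sketch below is a rough in-process counterpart of `jstack -l`; the class name `ThreadDumpSketch` and the output layout are assumptions of this example and do not match the exact jstack format.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ThreadDumpSketch {

    // Builds a textual dump of all live threads with their stack traces
    static String dumpAllThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();

        // Request lock details only where the JVM supports them
        ThreadInfo[] infos = bean.dumpAllThreads(
            bean.isObjectMonitorUsageSupported(),
            bean.isSynchronizerUsageSupported());

        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : infos) {
            out.append('"').append(info.getThreadName()).append('"')
               .append(" state=").append(info.getThreadState()).append('\n');
            if (info.getLockName() != null) {
                out.append("    waiting on ").append(info.getLockName()).append('\n');
            }
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```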

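Advanced Profiling Tools

JVisualVM

JVisualVM provides a complete graphical interface for monitoring and profiling. It was bundled with the JDK as jvisualvm through JDK 8; since JDK 9 it is distributed separately as VisualVM.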

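import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class VisualVMDemo {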

    // Methods designed to be easy to identify in VisualVM
    public static void main(String[] args) {
        VisualVMDemo demo = new VisualVMDemo();

        System.out.println("Starting VisualVM demo - attach profiler now");

        // Pause to allow the profiler to attach
        try {
            Thread.sleep(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }

        // Observable workload
        demo.runCPUIntensiveTask();
        demo.runMemoryIntensiveTask();
        demo.runConcurrentTask();

        System.out.println("Demo completed");
    }

    // Task identifiable as a hot spot
    private void runCPUIntensiveTask() {
        System.out.println("Running CPU intensive task...");
        long startTime = System.currentTimeMillis();

        // Observable intensive computation
        double result = 0;
        for (int i = 0; i < 10_000_000; i++) {
            result += Math.sin(i) * Math.cos(i);
        }

        long endTime = System.currentTimeMillis();
        System.out.println("CPU task completed in: " + (endTime - startTime) + "ms");
    }

    // Task for memory profiling
    private void runMemoryIntensiveTask() {
        System.out.println("Running memory intensive task...");

        List<byte[]> memoryHog = new ArrayList<>();

        // Observable allocations
        for (int i = 0; i < 1000; i++) {
            byte[] largeArray = new byte[100_000]; // 100 KB per array
            Arrays.fill(largeArray, (byte) i);
            memoryHog.add(largeArray);

            if (i % 100 == 0) {
                // Request a GC periodically to observe its behavior (System.gc() is only a hint)
                System.gc();
            }
        }

        System.out.println("Memory task completed, allocated: " + memoryHog.size() + " arrays");
    }

    // Concurrent task for thread profiling
    private void runConcurrentTask() {
        System.out.println("Running concurrent task...");

        ExecutorService executor = Executors.newFixedThreadPool(4);
        List<Future<?>> futures = new ArrayList<>();

        // Observable parallel tasks
        for (int i = 0; i < 8; i++) {
            final int taskId = i;
            Future<?> future = executor.submit(() -> {
                simulateWork(taskId);
            });
            futures.add(future);
        }

        // Wait for completion
        for (Future<?> future : futures) {
            try {
                future.get();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // Preserve the interrupt status
                break;
            } catch (ExecutionException e) {
                e.printStackTrace();
            }
        }

        executor.shutdown();
        System.out.println("Concurrent task completed");
    }

    private void simulateWork(int taskId) {
        try {
            Thread.sleep(1000 + taskId * 200); // Simulate a variable workload

            // Per-thread computation
            double result = 0;
            for (int i = 0; i < 1_000_000; i++) {
                result += Math.random();
            }

            System.out.println("Task " + taskId + " completed with result: " +
                             String.format("%.2f", result));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

Programmatic Profiling

import java.lang.management.*;
import java.util.*;
import java.util.concurrent.*;

public class ProgrammaticProfiling {

    private final MemoryMXBean memoryBean = ManagementFactory.getMemoryMXBean();
    private final List<GarbageCollectorMXBean> gcBeans = ManagementFactory.getGarbageCollectorMXBeans();
    private final ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();

    // Continuous metric monitoring
    public void startContinuousMonitoring() {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);

        scheduler.scheduleAtFixedRate(() -> {
            collectAndLogMetrics();
        }, 0, 5, TimeUnit.SECONDS);

        System.out.println("Continuous monitoring started");
    }

    private void collectAndLogMetrics() {
        // Memory metrics
        MemoryUsage heapUsage = memoryBean.getHeapMemoryUsage();
        long usedHeap = heapUsage.getUsed();
        long maxHeap = heapUsage.getMax(); // -1 if the maximum is undefined
        double heapUtilization = maxHeap > 0 ? (double) usedHeap / maxHeap * 100 : Double.NaN;

        // GC metrics
        long totalGCTime = 0;
        long totalGCCollections = 0;
        for (GarbageCollectorMXBean gcBean : gcBeans) {
            totalGCTime += gcBean.getCollectionTime();
            totalGCCollections += gcBean.getCollectionCount();
        }

        // Thread metrics
        int threadCount = threadBean.getThreadCount();
        long[] deadlockedThreads = threadBean.findDeadlockedThreads();

        // CPU time of the calling thread, i.e. the scheduler thread running this method (if supported)
        long cpuTime = 0;
        if (threadBean.isCurrentThreadCpuTimeSupported()) {
            cpuTime = threadBean.getCurrentThreadCpuTime();
        }

        // Log metrics
        System.out.printf("METRICS: Heap=%.1f%%, GC=%dms (%d collections), Threads=%d, CPU=%dms%n",
                         heapUtilization, totalGCTime, totalGCCollections, threadCount, cpuTime / 1_000_000);

        if (deadlockedThreads != null) {
            System.out.println("WARNING: Deadlock detected!");
        }
    }

    // Performance measurement utilities
    public static class PerformanceTimer {
        private final Map<String, Long> startTimes = new ConcurrentHashMap<>();
        private final Map<String, List<Long>> measurements = new ConcurrentHashMap<>();

        public void start(String operation) {
            startTimes.put(operation, System.nanoTime());
        }

        public void stop(String operation) {
            Long startTime = startTimes.remove(operation);
            if (startTime != null) {
                long duration = System.nanoTime() - startTime;
                measurements.computeIfAbsent(operation, k -> new ArrayList<>()).add(duration);
            }
        }

        public void printStatistics() {
            for (Map.Entry<String, List<Long>> entry : measurements.entrySet()) {
                String operation = entry.getKey();
                List<Long> times = entry.getValue();

                if (!times.isEmpty()) {
                    long min = Collections.min(times);
                    long max = Collections.max(times);
                    double avg = times.stream().mapToLong(Long::longValue).average().orElse(0);

                    System.out.printf("%s: min=%.2fms, max=%.2fms, avg=%.2fms, count=%d%n",
                                     operation, min/1e6, max/1e6, avg/1e6, times.size());
                }
            }
        }
    }

    // Example usage of the timer
    public static void demonstratePerformanceTimer() {
        PerformanceTimer timer = new PerformanceTimer();

        // Measure several operations
        for (int i = 0; i < 100; i++) {
            timer.start("string-operation");
            String result = performStringOperation();
            timer.stop("string-operation");

            timer.start("math-operation");
            double mathResult = performMathOperation();
            timer.stop("math-operation");

            timer.start("collection-operation");
            List<Integer> collectionResult = performCollectionOperation();
            timer.stop("collection-operation");
        }

        timer.printStatistics();
    }

    private static String performStringOperation() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("String ").append(i);
        }
        return sb.toString();
    }

    private static double performMathOperation() {
        double result = 0;
        for (int i = 0; i < 10000; i++) {
            result += Math.sin(i) * Math.cos(i);
        }
        return result;
    }

    private static List<Integer> performCollectionOperation() {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            list.add(i);
        }
        Collections.sort(list);
        return list;
    }
}
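Minimum, maximum, and average can hide tail latency, so timing data like the measurements collected above is often summarized with percentiles as well. The following sketch computes a nearest-rank percentile over a set of durations; the class `LatencyPercentiles` and its method are hypothetical names, and nearest-rank is only one of several percentile definitions in use.

```java
import java.util.Arrays;

class LatencyPercentiles {

    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it
    static long percentile(long[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samples.clone(); // Leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        long[] durationsNanos = {10, 9, 8, 7, 6, 5, 4, 3, 2, 1};
        System.out.println("p50 = " + percentile(durationsNanos, 50));
        System.out.println("p90 = " + percentile(durationsNanos, 90));
        System.out.println("p100 = " + percentile(durationsNanos, 100));
    }
}
```

With the ten samples above, the median is the fifth-smallest value and the 90th percentile the ninth-smallest; a single slow outlier would move the maximum but leave both of these unchanged.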

Advanced Memory Profiling

Heap Dump Analysis

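import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

public class HeapDumpAnalysis {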

    // Creates interesting conditions for heap dump analysis
    public static void main(String[] args) {
        HeapDumpAnalysis analyzer = new HeapDumpAnalysis();

        System.out.println("Creating memory patterns for heap dump analysis...");

        // Pattern 1: Memory leak simulation
        analyzer.simulateMemoryLeak();

        // Pattern 2: Large object retention
        analyzer.createLargeObjects();

        // Pattern 3: String interning patterns
        analyzer.demonstrateStringPatterns();

        System.out.println("Memory patterns created. Take heap dump now using:");
        System.out.println("jcmd " + ProcessHandle.current().pid() + " GC.heap_dump analysis.hprof");

        // Keep the objects in memory for analysis
        try {
            Thread.sleep(60000); // 1 minute to take the heap dump
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private final List<Object> leakyList = new ArrayList<>();
    private final Map<String, byte[]> largeObjectStore = new HashMap<>();

    private void simulateMemoryLeak() {
        // Simulate a memory leak by accumulating objects
        for (int i = 0; i < 10000; i++) {
            LeakyObject obj = new LeakyObject("Object " + i);
            leakyList.add(obj);

            // Add references that might be forgotten
            obj.addReference(new HelperObject());
        }

        System.out.println("Created " + leakyList.size() + " potentially leaky objects");
    }

    private void createLargeObjects() {
        // Create large objects that are easy to spot in the heap dump
        for (int i = 0; i < 50; i++) {
            String key = "LargeObject_" + i;
            byte[] largeArray = new byte[1024 * 1024]; // 1 MB per object
            Arrays.fill(largeArray, (byte) i);
            largeObjectStore.put(key, largeArray);
        }

        System.out.println("Created " + largeObjectStore.size() + " large objects");
    }

    private void demonstrateStringPatterns() {
        List<String> strings = new ArrayList<>();

        // Pattern with potential for string deduplication
        String[] baseStrings = {"ACTIVE", "INACTIVE", "PENDING", "COMPLETED"};

        for (int i = 0; i < 1000; i++) {
            // Create duplicates that could be optimized away
            String duplicated = new String(baseStrings[i % baseStrings.length]);
            strings.add(duplicated);

            // Mix in unique strings
            strings.add("Unique_" + i);
        }

        System.out.println("Created " + strings.size() + " string objects");
    }

    static class LeakyObject {
        private final String name;
        private final List<HelperObject> references = new ArrayList<>();
        private final byte[] data = new byte[1024]; // 1 KB per object

        public LeakyObject(String name) {
            this.name = name;
        }

        public void addReference(HelperObject helper) {
            references.add(helper);
        }
    }

    static class HelperObject {
        private final String id = UUID.randomUUID().toString();
        private final byte[] buffer = new byte[512]; // 512 bytes
    }
}
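A heap dump can also be triggered from inside the application, which helps when the dump should be taken at a precise point in the code. The sketch below assumes a HotSpot-based JVM, since it relies on `com.sun.management.HotSpotDiagnosticMXBean`; the class `HeapDumpSketch` and the temporary file location are choices made for this example only.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

import com.sun.management.HotSpotDiagnosticMXBean;

class HeapDumpSketch {

    // Writes a dump of live (reachable) objects to a fresh temporary file
    static Path dumpLiveObjects() {
        try {
            // dumpHeap refuses to overwrite, so write into a new directory;
            // the file name must end with ".hprof"
            Path dir = Files.createTempDirectory("heapdump");
            Path file = dir.resolve("sketch.hprof");

            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(file.toString(), true); // true = only live objects

            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = dumpLiveObjects();
        System.out.println("Heap dump written to " + file
            + " (" + file.toFile().length() + " bytes)");
    }
}
```

The resulting `.hprof` file can be opened in the same analysis tools as a dump taken with jcmd or jmap. Dumping pauses the application and the file can be as large as the heap, so this is best reserved for diagnostics.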

Memory Leak Detection

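import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MemoryLeakDetection {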

    // Weak-reference tracker for leak detection
    private static final Map<String, WeakReference<Object>> trackedObjects = new ConcurrentHashMap<>();
    private static final ReferenceQueue<Object> referenceQueue = new ReferenceQueue<>();

    public static void trackObject(String id, Object obj) {
        WeakReference<Object> ref = new WeakReference<>(obj, referenceQueue);
        trackedObjects.put(id, ref);
    }

    public static void startLeakDetection() {
        Thread cleanupThread = new Thread(() -> {
            try {
                while (true) {
                    Reference<?> ref = referenceQueue.remove(1000); // 1-second timeout
                    if (ref != null) {
                        // The object has been garbage collected: stop tracking it
                        trackedObjects.entrySet().removeIf(entry -> entry.getValue() == ref);
                    }

                    // Periodically check for potential leaks
                    checkForPotentialLeaks();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        cleanupThread.setDaemon(true);
        cleanupThread.start();
    }

    private static void checkForPotentialLeaks() {
        // Request a GC (only a hint to the JVM) and check which objects are still referenced
        System.gc();

        try {
            Thread.sleep(100); // Give the GC time to run
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }

        long potentialLeaks = trackedObjects.entrySet().stream()
            .filter(entry -> entry.getValue().get() != null)
            .count();

        if (potentialLeaks > 0) {
            System.out.println("Potential memory leaks detected: " + potentialLeaks + " objects");
        }
    }

    // Example usage
    public static void demonstrateLeakDetection() {
        startLeakDetection();

        // Create objects that should be garbage collected
        for (int i = 0; i < 100; i++) {
            Object temp = new Object();
            trackObject("temp_" + i, temp);
        }

        // Create objects that may leak
        List<Object> leaked = new ArrayList<>();
        for (int i = 0; i < 50; i++) {
            Object persistent = new Object();
            leaked.add(persistent); // Keeps a strong reference
            trackObject("leaked_" + i, persistent);
        }

        System.out.println("Created objects for leak detection test");

        // Wait to observe leak detection
        try {
            Thread.sleep(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

Profiling-Based Optimization

Performance Hotspot Identification

import java.util.ArrayList;
import java.util.List;

public class HotspotOptimization {

    // Method with an identifiable hotspot
    public static void inefficientStringProcessing(List<String> data) {
        String result = "";

        // HOTSPOT: String concatenation in a loop
        for (String item : data) {
            result += item + " "; // Inefficient: each iteration copies the whole string into a new object
        }

        System.out.println("Processed " + data.size() + " items");
    }

    // Optimized version after profiling
    public static void optimizedStringProcessing(List<String> data) {
        StringBuilder sb = new StringBuilder(data.size() * 20); // Pre-size estimate

        for (String item : data) {
            sb.append(item).append(" "); // Efficient: reuses the same buffer
        }

        String result = sb.toString();
        System.out.println("Processed " + data.size() + " items efficiently");
    }

    // Benchmark comparing the two versions
    public static void benchmarkStringProcessing() {
        List<String> testData = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            testData.add("Item " + i);
        }

        // Measure the inefficient version
        long startTime = System.nanoTime();
        inefficientStringProcessing(testData);
        long inefficientTime = System.nanoTime() - startTime;

        // Measure the optimized version
        startTime = System.nanoTime();
        optimizedStringProcessing(testData);
        long optimizedTime = System.nanoTime() - startTime;

        System.out.printf("Inefficient: %.2f ms%n", inefficientTime / 1e6);
        System.out.printf("Optimized: %.2f ms%n", optimizedTime / 1e6);
        System.out.printf("Speedup: %.1fx%n", (double) inefficientTime / optimizedTime);
    }

    // Profiling-guided optimization for algorithms
    public static class AlgorithmOptimization {

        // Naive version, identified as a hotspot
        public static boolean isPrimeNaive(int n) {
            if (n <= 1) return false;

            for (int i = 2; i < n; i++) { // Inefficient: tries every candidate divisor below n
                if (n % i == 0) return false;
            }
            return true;
        }

        // Optimized version after profiling
        public static boolean isPrimeOptimized(int n) {
            if (n <= 1) return false;
            if (n <= 3) return true;
            if (n % 2 == 0 || n % 3 == 0) return false;

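            // Check divisors only up to sqrt(n), testing just candidates of the form 6k ± 1
            // (the long cast keeps i * i from overflowing for n near Integer.MAX_VALUE)
            for (int i = 5; (long) i * i <= n; i += 6) {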
                if (n % i == 0 || n % (i + 2) == 0) return false;
            }
            return true;
        }

        public static void benchmarkPrimeAlgorithms() {
            int limit = 100000;

            // Benchmark naive
            long startTime = System.nanoTime();
            int naiveCount = 0;
            for (int i = 0; i < limit; i++) {
                if (isPrimeNaive(i)) naiveCount++;
            }
            long naiveTime = System.nanoTime() - startTime;

            // Benchmark optimized
            startTime = System.nanoTime();
            int optimizedCount = 0;
            for (int i = 0; i < limit; i++) {
                if (isPrimeOptimized(i)) optimizedCount++;
            }
            long optimizedTime = System.nanoTime() - startTime;

            System.out.printf("Naive algorithm: %d primes in %.2f ms%n",
                             naiveCount, naiveTime / 1e6);
            System.out.printf("Optimized algorithm: %d primes in %.2f ms%n",
                             optimizedCount, optimizedTime / 1e6);
            System.out.printf("Speedup: %.1fx%n", (double) naiveTime / optimizedTime);
        }
    }
}
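Single-shot timings like the ones above include JIT compilation and class loading, so the reported speedup can vary considerably between runs. As a hedged illustration, the sketch below adds warm-up runs, takes the median of several measurements, and stores results in a volatile field so the measured work is not optimized away. The class `WarmupBenchmarkSketch` and its methods are hypothetical; for serious measurements a dedicated harness such as JMH is the usual choice.

```java
import java.util.Arrays;
import java.util.function.IntSupplier;

class WarmupBenchmarkSketch {

    // Results are written here so the JIT cannot discard the measured work
    static volatile int sink;

    // Runs the task warmupRuns times unmeasured, then returns the median
    // (upper middle for an even count) of measuredRuns timed executions
    static long medianNanos(IntSupplier task, int warmupRuns, int measuredRuns) {
        if (measuredRuns < 1) {
            throw new IllegalArgumentException("measuredRuns must be >= 1");
        }
        for (int i = 0; i < warmupRuns; i++) {
            sink = task.getAsInt();
        }
        long[] times = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            sink = task.getAsInt();
            times[i] = System.nanoTime() - start;
        }
        Arrays.sort(times);
        return times[measuredRuns / 2];
    }

    // Example workload: counts primes below limit by trial division
    static int countPrimes(int limit) {
        int count = 0;
        for (int n = 2; n < limit; n++) {
            boolean prime = true;
            for (int d = 2; (long) d * d <= n; d++) {
                if (n % d == 0) {
                    prime = false;
                    break;
                }
            }
            if (prime) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        long median = medianNanos(() -> countPrimes(100_000), 10, 21);
        System.out.printf("Median: %.2f ms%n", median / 1e6);
    }
}
```

The median is used instead of the mean because one run interrupted by a GC pause or a context switch would otherwise skew the result.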

Best Practices

Profile in a Production-Like Environment: Performance can differ significantly between development and production.

Focus on Real Bottlenecks: Use profiling to identify actual hotspots; do not optimize prematurely.

Measure Before and After: Always quantify the impact of optimizations.

Consider Overhead: Some profilers introduce significant overhead, so choose the profiling mode to suit the environment.

Monitor Continuously: Implement continuous monitoring to catch performance regressions.

Profile Different Scenarios: Test with different load patterns and data sizes.

Document Findings: Keep a record of profiling findings and the optimizations applied.

Conclusion

Code profiling is an iterative process that requires appropriate tools, systematic methodologies, and accurate interpretation of results. Combining the JVM's built-in tools, specialized profilers, and programmatic monitoring gives a complete view of application performance.

The effectiveness of profiling depends on the ability to identify the real bottlenecks and apply targeted optimizations. A data-driven approach based on accurate measurement is essential to avoid premature optimization and to focus effort on the components with the greatest impact on overall performance.

As the JVM evolves and tools such as JFR mature, profiling is becoming more precise and less invasive, enabling in-depth analysis even in production environments.