second

2026-04-16 20:41:31 -04:00
parent 9ad604196e
commit 269d560847
1 changed files with 251 additions and 0 deletions
--- a/analysis/03-Two_Sum_Is_Not_About_Numbers/readme.md
+++ b/analysis/03-Two_Sum_Is_Not_About_Numbers/readme.md
@@ -0,0 +1,251 @@
+# Analysis #XX — Two Sum Is Not About Numbers
+
+## Problem
+
+At first glance, the problem looks trivial:
+
+> Given a list of values, find two elements whose sum equals a target.
+
+This is one of the best-known interview questions, commonly referred to as **Two Sum**.
+
+It is simple, clean, and perfectly defined:
+- a static array
+- exact arithmetic
+- a guaranteed answer
+
+And that’s exactly why it works so well in interviews.
+
+---
+
+## Typical Interview Thinking
+
+A candidate is expected to go through a familiar progression:
+
+1. Start with brute force (O(n²))
+2. Recognize inefficiency
+3. Optimize using a hash map
+4. Achieve O(n) expected time, at the cost of O(n) extra memory
+
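+```cpp
+#include <unordered_map>
+#include <vector>
+
+// Returns the indices of two elements that sum to target, or {} if none exist.
+std::vector<int> twoSum(const std::vector<int>& nums, int target) {
+    std::unordered_map<int, int> seen;  // value -> index of an earlier element
+
+    for (int i = 0; i < static_cast<int>(nums.size()); ++i) {
+        int complement = target - nums[i];
+
+        // A single find() replaces the count() + operator[] double lookup.
+        auto it = seen.find(complement);
+        if (it != seen.end()) {
+            return {it->second, i};
+        }
+
+        seen[nums[i]] = i;
+    }
+
+    return {};
+}
+```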
+
+The “correct” answer is not about solving the problem.
+
+It is about recognizing the pattern.
+
+---
+
+## What This Actually Tests
+
+Despite its simplicity, this problem evaluates:
+
+- familiarity with standard patterns
+- ability to choose a data structure
+- understanding of time complexity
+
+But most importantly:
+
+> it tests whether you have seen this problem before.
+
+---
+
+## A Subtle Shift
+
+Now let’s take the same idea and move it one step closer to reality.
+
+Instead of numbers, we have **log events**.
+
+Instead of a static array, we have a **stream**.
+
+Instead of a clean equality, we have **imperfect data and thresholds**.
+
+---
+
+## Synthetic Log Example
+
+```
+2026-04-16T10:15:01.123Z service=api    event=parse_input   latency=12ms request_id=req-1001
+2026-04-16T10:15:01.130Z service=cache  event=cache_miss    latency=48ms request_id=req-1001
+2026-04-16T10:15:01.135Z service=db     event=read_user     latency=55ms request_id=req-1001
+2026-04-16T10:15:01.144Z service=net    event=external_call latency=47ms request_id=req-1001
+2026-04-16T10:15:01.151Z service=cache  event=cache_miss    latency=60ms request_id=req-3001
+2026-04-16T10:15:01.154Z service=net    event=external_call latency=52ms request_id=req-3001
+```
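+
+Before lines like these can be correlated, each one has to become a typed record. The following is a minimal sketch, assuming every line follows exactly the layout shown above (a fixed-format UTC timestamp followed by `key=value` fields). The `Event` struct and the `parseLine` helper are hypothetical names introduced for illustration, and the date part of the timestamp is deliberately ignored.
+
+```cpp
+#include <cstdio>
+#include <sstream>
+#include <string>
+
+// Hypothetical record for one log line; fields mirror the keys shown above.
+struct Event {
+    long long timestampMs = 0;  // milliseconds since midnight UTC (date ignored)
+    std::string service;
+    std::string name;           // value of the "event" key
+    int latencyMs = 0;
+    std::string requestId;
+};
+
+// Parses one line into `out`. Returns false when the timestamp is malformed
+// or when the latency or request_id field is missing.
+bool parseLine(const std::string& line, Event& out) {
+    std::istringstream in(line);
+    std::string token;
+    if (!(in >> token)) return false;
+
+    int y, mo, d, h, mi, s, ms;
+    if (std::sscanf(token.c_str(), "%d-%d-%dT%d:%d:%d.%d",
+                    &y, &mo, &d, &h, &mi, &s, &ms) != 7) {
+        return false;
+    }
+
+    Event e;
+    e.timestampMs = ((h * 60LL + mi) * 60 + s) * 1000 + ms;
+
+    bool hasLatency = false;
+    while (in >> token) {
+        std::string::size_type eq = token.find('=');
+        if (eq == std::string::npos) continue;  // skip tokens that are not key=value
+        const std::string key = token.substr(0, eq);
+        const std::string value = token.substr(eq + 1);
+
+        if (key == "service") e.service = value;
+        else if (key == "event") e.name = value;
+        else if (key == "request_id") e.requestId = value;
+        else if (key == "latency") {
+            // %d reads the leading digits and stops at the "ms" suffix.
+            hasLatency = std::sscanf(value.c_str(), "%d", &e.latencyMs) == 1;
+        }
+    }
+
+    if (!hasLatency || e.requestId.empty()) return false;
+    out = e;
+    return true;
+}
+```
+
+Rejecting a line instead of guessing at missing fields is a design choice of this sketch; a real pipeline might prefer to count or quarantine such lines.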
+
+---
+
+## Real Problem
+
+We are no longer asked to find two numbers.
+
+Instead, the problem becomes:
+
+> Detect whether there exist two events:
+> - belonging to the same request
+> - occurring close in time
+> - whose combined latency exceeds a threshold
+
+This still *looks* like Two Sum.
+
+But it is not.
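+
+The three conditions can be written down as a single predicate. This is only a sketch: the `Event` fields, the `comparable` and `isSuspiciousPair` helpers, and the choice to pass the window and threshold as parameters are assumptions made for illustration.
+
+```cpp
+#include <cstdlib>
+#include <string>
+
+// Hypothetical event record; only the fields the conditions need.
+struct Event {
+    std::string requestId;
+    long long timestampMs;  // event time in milliseconds
+    int latencyMs;
+};
+
+// True when two events may be compared at all: same request, close in time.
+// The absolute difference makes the check independent of arrival order.
+bool comparable(const Event& a, const Event& b, long long windowMs) {
+    return a.requestId == b.requestId &&
+           std::llabs(a.timestampMs - b.timestampMs) <= windowMs;
+}
+
+// True when a comparable pair also exceeds the combined-latency threshold.
+bool isSuspiciousPair(const Event& a, const Event& b,
+                      long long windowMs, int thresholdMs) {
+    return comparable(a, b, windowMs) &&
+           a.latencyMs + b.latencyMs > thresholdMs;
+}
+```
+
+In the sample log, the `cache_miss` (48 ms) and `read_user` (55 ms) events of `req-1001` are 5 ms apart, so with an assumed 10 ms window and 100 ms threshold the pair qualifies. The same `cache_miss` paired with any event of `req-3001` does not, whatever the latencies are.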
+
+---
+
+## Where the Interview Model Breaks
+
+### 1. No Exact Match
+
+Interview version:
+```
+a + b == target
+```
+
+Real version:
+```
+a + b > threshold
+```
+
+We are not searching for a perfect complement.
+We are evaluating a condition.
+
+---
+
+### 2. Context Is Mandatory
+
+You cannot combine arbitrary events.
+
+A latency spike only makes sense **within the same request**.
+
+Without context, the result is meaningless.
+
+---
+
+### 3. Time Matters
+
+Events are not just values — they exist in time.
+
+Two events five seconds apart may not be related at all.
+
+This introduces:
+- time windows
+- ordering issues
+- temporal constraints
+
+---
+
+### 4. Data Is Not Static
+
+LeetCode assumes:
+- full dataset
+- already loaded
+- perfectly ordered
+
+Reality:
+- streaming input
+- delayed events
+- missing entries
+- out-of-order delivery
+
+---
+
+## What the Problem Really Becomes
+
+At this point, the challenge is no longer:
+
+> “find two numbers”
+
+It becomes:
+
+> “determine which events are comparable at all”
+
+And that is a fundamentally different problem.
+
+---
+
+## Real Engineering Approach
+
+Instead of solving a mathematical puzzle, we build a system.
+
+### Core Idea
+
+Maintain a sliding window of recent events per request.
+
+### Pseudocode
+
+```
+for each incoming event:
+    bucket = active_events[event.request_id]
+
+    remove from bucket every event older than event.timestamp - window
+
+    for each old_event in bucket:
+        if event.latency + old_event.latency > threshold:
+            report anomaly
+
+    add event to bucket
+```
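+
+The pseudocode above can be turned into a small C++ sketch. It assumes the events of one request arrive in timestamp order, and it uses hypothetical names (`Event`, `PairDetector`, `onEvent`). The window and threshold are constructor parameters because the text does not fix their values.
+
+```cpp
+#include <deque>
+#include <string>
+#include <unordered_map>
+#include <utility>
+#include <vector>
+
+// Hypothetical event record, matching the fields used in the pseudocode.
+struct Event {
+    std::string requestId;
+    long long timestampMs;  // event time in milliseconds
+    int latencyMs;
+};
+
+class PairDetector {
+public:
+    PairDetector(long long windowMs, int thresholdMs)
+        : windowMs_(windowMs), thresholdMs_(thresholdMs) {}
+
+    // Processes one event and returns every (older, newer) pair of the same
+    // request whose combined latency exceeds the threshold.
+    std::vector<std::pair<Event, Event>> onEvent(const Event& event) {
+        std::deque<Event>& bucket = active_[event.requestId];
+
+        // Time-based eviction: drop events that fell out of the window.
+        // Relies on the assumption that a request's events arrive in order.
+        while (!bucket.empty() &&
+               event.timestampMs - bucket.front().timestampMs > windowMs_) {
+            bucket.pop_front();
+        }
+
+        std::vector<std::pair<Event, Event>> anomalies;
+        for (const Event& old : bucket) {
+            if (old.latencyMs + event.latencyMs > thresholdMs_) {
+                anomalies.emplace_back(old, event);
+            }
+        }
+
+        bucket.push_back(event);
+        return anomalies;
+    }
+
+private:
+    long long windowMs_;
+    int thresholdMs_;
+    std::unordered_map<std::string, std::deque<Event>> active_;
+};
+```
+
+Fed the six sample log lines with an assumed 10 ms window and 100 ms threshold, this sketch reports three pairs: `cache_miss` + `read_user` and `read_user` + `external_call` for `req-1001`, and `cache_miss` + `external_call` for `req-3001`. Note that it never deletes a bucket once its request goes quiet, so memory is bounded per request but not overall; expiring idle buckets is left out here.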
+
+---
+
+## What This Introduces
+
+Now we must deal with:
+
+- bounded memory
+- streaming constraints
+- time-based eviction
+- correlation logic
+
+And beyond that:
+
+- out-of-order events
+- duplicate logs
+- partial data
+- noise filtering
+
+---
+
+## The Real Insight
+
+The difficulty is not in computing a sum.
+
+The difficulty is in defining:
+
+- what data is valid
+- what events belong together
+- what “close enough” means
+- how the system behaves under imperfect conditions
+
+---
+
+## Key Takeaway
+
+Two Sum is often presented as a problem about numbers.
+
+In reality, it is a problem about assumptions.
+
+Remove those assumptions, and the problem changes completely.
+
+> The challenge is not finding two values.  
+> The challenge is understanding whether those values should ever be compared.
+
+---
+
+## Project Perspective
+
+Exists in real engineering?  
+→ Yes, but as event correlation under constraints
+
+Exists in interview form?  
+→ Yes, but stripped of context and complexity