2026-04-16 20:41:31 -04:00
parent 9ad604196e
commit 269d560847

@@ -0,0 +1,251 @@
# Analysis #XX — Two Sum Is Not About Numbers
## Problem
At first glance, the problem looks trivial:
> Given a list of values, find two elements whose sum equals a target.
This is one of the most well-known interview questions, commonly referred to as **Two Sum**.
It is simple, clean, and perfectly defined:
- a static array
- exact arithmetic
- a guaranteed answer
And that's exactly why it works so well in interviews.
---
## Typical Interview Thinking
A candidate is expected to go through a familiar progression:
1. Start with brute force (O(n²))
2. Recognize inefficiency
3. Optimize using a hash map
4. Achieve O(n) time complexity
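For contrast with the hash-map version, the brute-force baseline from step 1 might look like the sketch below. The function name `twoSumBruteForce` and the `{-1, -1}` "not found" convention are illustrative assumptions, not part of the original problem statement.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Brute-force baseline: test every pair (i, j) with i < j.
// O(n^2) time, O(1) extra space.
// Returns {-1, -1} when no pair sums to target (a convention assumed for this sketch).
std::pair<int, int> twoSumBruteForce(const std::vector<int>& nums, int target) {
    for (std::size_t i = 0; i < nums.size(); ++i) {
        for (std::size_t j = i + 1; j < nums.size(); ++j) {
            if (nums[i] + nums[j] == target) {
                return {static_cast<int>(i), static_cast<int>(j)};
            }
        }
    }
    return {-1, -1};
}
```

The hash-map version below trades that quadratic scan for O(n) extra memory.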
```cpp
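#include <unordered_map>
#include <vector>

// Returns the indices of two elements that sum to target, or {} if no such pair exists.
std::vector<int> twoSum(const std::vector<int>& nums, int target) {
    std::unordered_map<int, int> seen;  // value -> index of an earlier element
    for (int i = 0; i < static_cast<int>(nums.size()); ++i) {
        int complement = target - nums[i];
        auto it = seen.find(complement);
        if (it != seen.end()) {
            return {it->second, i};
        }
        seen[nums[i]] = i;
    }
    return {};
}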
```
The “correct” answer is not about solving the problem.
It is about recognizing the pattern.
---
## What This Actually Tests
Despite its simplicity, this problem evaluates:
- familiarity with standard patterns
- ability to choose a data structure
- understanding of time complexity
But most importantly:
> it tests whether you have seen this problem before.
---
## A Subtle Shift
Now let's take the same idea and move it one step closer to reality.
Instead of numbers, we have **log events**.
Instead of a static array, we have a **stream**.
Instead of a clean equality, we have **imperfect data and thresholds**.
---
## Synthetic Log Example
```
2026-04-16T10:15:01.123Z service=api event=parse_input latency=12ms request_id=req-1001
2026-04-16T10:15:01.130Z service=cache event=cache_miss latency=48ms request_id=req-1001
2026-04-16T10:15:01.135Z service=db event=read_user latency=55ms request_id=req-1001
2026-04-16T10:15:01.144Z service=net event=external_call latency=47ms request_id=req-1001
2026-04-16T10:15:01.151Z service=cache event=cache_miss latency=60ms request_id=req-3001
2026-04-16T10:15:01.154Z service=net event=external_call latency=52ms request_id=req-3001
```
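Before any correlation can happen, lines like these have to become structured records. The following is a minimal parsing sketch, assuming the exact layout shown above: a timestamp with three fractional digits, followed by space-separated `key=value` tokens. The `LogEvent` struct and `parseLogLine` function are hypothetical names introduced for illustration, and the date part of the timestamp is deliberately ignored to keep the example short.

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// Hypothetical record type; field names mirror the keys in the sample log.
struct LogEvent {
    long long time_ms = 0;  // milliseconds since midnight (date ignored in this sketch)
    std::string service;
    std::string event;
    int latency_ms = 0;
    std::string request_id;
};

// Parses one line of the form
//   <timestamp> service=... event=... latency=<N>ms request_id=...
// Returns false if the timestamp or any expected key is missing or malformed.
bool parseLogLine(const std::string& line, LogEvent& out) {
    std::istringstream in(line);
    std::string timestamp;
    if (!(in >> timestamp)) return false;

    int year = 0, month = 0, day = 0, hour = 0, minute = 0, second = 0, millis = 0;
    if (std::sscanf(timestamp.c_str(), "%d-%d-%dT%d:%d:%d.%dZ",
                    &year, &month, &day, &hour, &minute, &second, &millis) != 7) {
        return false;
    }

    LogEvent ev;
    // Assumes exactly three fractional digits, as in the sample log.
    ev.time_ms = ((hour * 60LL + minute) * 60 + second) * 1000 + millis;

    bool has_service = false, has_event = false, has_latency = false, has_request = false;
    std::string token;
    while (in >> token) {
        const std::string::size_type eq = token.find('=');
        if (eq == std::string::npos) continue;  // skip tokens that are not key=value
        const std::string key = token.substr(0, eq);
        const std::string value = token.substr(eq + 1);
        if (key == "service") {
            ev.service = value;
            has_service = true;
        } else if (key == "event") {
            ev.event = value;
            has_event = true;
        } else if (key == "request_id") {
            ev.request_id = value;
            has_request = true;
        } else if (key == "latency") {
            // Reads the leading integer of values such as "48ms"; unit assumed to be ms.
            if (std::sscanf(value.c_str(), "%d", &ev.latency_ms) != 1) return false;
            has_latency = true;
        }
    }
    if (!(has_service && has_event && has_latency && has_request)) return false;
    out = ev;
    return true;
}
```

Even this small parser has to decide what to do with malformed input, which is a question the interview version never raises.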
---
## Real Problem
We are no longer asked to find two numbers.
Instead, the problem becomes:
> Detect whether there exist two events:
> - belonging to the same request
> - occurring close in time
> - whose combined latency exceeds a threshold
This still *looks* like Two Sum.
But it is not.
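Written as code, the three conditions might look like the predicate below. This is a sketch under stated assumptions: the `Event` struct, the function name `isSuspiciousPair`, and the `window_ms` and `threshold_ms` parameters are all hypothetical, since the problem statement does not fix concrete values for "close in time" or the threshold.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical minimal event record for this sketch.
struct Event {
    std::string request_id;
    long long time_ms;  // event timestamp in milliseconds
    int latency_ms;
};

// True only if all three conditions hold:
//   1. both events belong to the same request,
//   2. their timestamps are at most window_ms apart,
//   3. their combined latency strictly exceeds threshold_ms.
bool isSuspiciousPair(const Event& a, const Event& b,
                      long long window_ms, int threshold_ms) {
    if (a.request_id != b.request_id) return false;
    if (std::llabs(a.time_ms - b.time_ms) > window_ms) return false;
    return a.latency_ms + b.latency_ms > threshold_ms;
}
```

Only the last line resembles Two Sum. The first two checks decide whether the comparison is meaningful at all.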
---
## Where the Interview Model Breaks
### 1. No Exact Match
Interview version:
```
a + b == target
```
Real version:
```
a + b > threshold
```
We are not searching for a perfect complement.
We are evaluating a condition.
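One consequence is that the complement lookup stops being useful: with an inequality there is no single value to look up. For the bare question "does any pair exceed the threshold?", a running maximum is enough, as the sketch below suggests. The function name `anyPairExceeds` is an illustrative assumption, and the sketch deliberately ignores request and time constraints.

```cpp
#include <cstddef>
#include <vector>

// Returns true if some pair of distinct elements sums to strictly more than threshold.
// Each new value only needs to be compared with the largest value seen before it:
// if it does not exceed the threshold with that one, it cannot with any smaller one.
bool anyPairExceeds(const std::vector<int>& latencies, int threshold) {
    if (latencies.empty()) return false;
    int max_seen = latencies[0];
    for (std::size_t i = 1; i < latencies.size(); ++i) {
        if (latencies[i] + max_seen > threshold) return true;
        if (latencies[i] > max_seen) max_seen = latencies[i];
    }
    return false;
}
```

The data structure changes as soon as the condition does, which is a hint that the memorized pattern was tied to the exact-match assumption.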
---
### 2. Context Is Mandatory
You cannot combine arbitrary events.
A latency spike only makes sense **within the same request**.
Without context, the result is meaningless.
---
### 3. Time Matters
Events are not just values — they exist in time.
Two events five seconds apart may not be related at all.
This introduces:
- time windows
- ordering issues
- temporal constraints
---
### 4. Data Is Not Static
LeetCode assumes:
- full dataset
- already loaded
- perfectly ordered
Reality:
- streaming input
- delayed events
- missing entries
- out-of-order delivery
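One common way to cope with out-of-order delivery is a small reorder buffer that holds events back for a fixed lateness allowance before releasing them in timestamp order. The sketch below is one possible shape for that idea, not something the original text prescribes. `TimedEvent`, `ReorderBuffer`, and `allowed_lateness_ms` are hypothetical names, and events that arrive later than the allowance are simply counted and dropped.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical minimal record for this sketch.
struct TimedEvent {
    long long time_ms;
    int latency_ms;
};

// Comparator that turns std::priority_queue into a min-heap on time_ms.
struct LaterFirst {
    bool operator()(const TimedEvent& a, const TimedEvent& b) const {
        return a.time_ms > b.time_ms;
    }
};

class ReorderBuffer {
public:
    explicit ReorderBuffer(long long allowed_lateness_ms)
        : allowed_lateness_ms_(allowed_lateness_ms) {}

    // Accepts an event in arrival order and returns the events that are now
    // safe to release, sorted by timestamp.
    std::vector<TimedEvent> push(const TimedEvent& ev) {
        std::vector<TimedEvent> ready;
        if (has_seen_ && ev.time_ms < max_seen_ms_ - allowed_lateness_ms_) {
            ++dropped_;  // arrived after its time slot was already released
            return ready;
        }
        heap_.push(ev);
        if (!has_seen_ || ev.time_ms > max_seen_ms_) max_seen_ms_ = ev.time_ms;
        has_seen_ = true;

        // Everything at or before the watermark can no longer be overtaken.
        const long long watermark = max_seen_ms_ - allowed_lateness_ms_;
        while (!heap_.empty() && heap_.top().time_ms <= watermark) {
            ready.push_back(heap_.top());
            heap_.pop();
        }
        return ready;
    }

    std::size_t dropped() const { return dropped_; }

private:
    long long allowed_lateness_ms_;
    long long max_seen_ms_ = 0;
    bool has_seen_ = false;
    std::size_t dropped_ = 0;
    std::priority_queue<TimedEvent, std::vector<TimedEvent>, LaterFirst> heap_;
};
```

The allowance is a trade-off: a larger value tolerates more disorder but delays every result by that amount.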
---
## What the Problem Really Becomes
At this point, the challenge is no longer:
> “find two numbers”
It becomes:
> “determine which events are comparable at all”
And that is a fundamentally different problem.
---
## Real Engineering Approach
Instead of solving a mathematical puzzle, we build a system.
### Core Idea
Maintain a sliding window of recent events per request.
### Pseudocode
```
for each incoming event:
    bucket = active_events[event.request_id]
    remove events outside time window
    for each old_event in bucket:
        if event.latency + old_event.latency > threshold:
            report anomaly
    add event to bucket
```
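A compact C++ rendering of the pseudocode above could look like this. It is a sketch, not a production design: the `Event` struct, the `PairCorrelator` class, and the constructor parameters are hypothetical, and it assumes that events for a given request arrive in non-decreasing time order.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical minimal event record for this sketch.
struct Event {
    std::string request_id;
    long long time_ms;
    int latency_ms;
};

class PairCorrelator {
public:
    PairCorrelator(long long window_ms, int threshold_ms)
        : window_ms_(window_ms), threshold_ms_(threshold_ms) {}

    // Processes one event and returns every (older, newer) pair within the
    // same request whose combined latency exceeds the threshold.
    std::vector<std::pair<Event, Event>> onEvent(const Event& ev) {
        std::vector<std::pair<Event, Event>> anomalies;
        std::deque<Event>& bucket = active_[ev.request_id];

        // Evict events that have fallen out of the time window.
        while (!bucket.empty() && ev.time_ms - bucket.front().time_ms > window_ms_) {
            bucket.pop_front();
        }
        // Compare the new event against everything still in the window.
        for (const Event& old_ev : bucket) {
            if (ev.latency_ms + old_ev.latency_ms > threshold_ms_) {
                anomalies.emplace_back(old_ev, ev);
            }
        }
        bucket.push_back(ev);
        return anomalies;
    }

    // Number of request buckets currently held in memory.
    std::size_t activeRequests() const { return active_.size(); }

private:
    long long window_ms_;
    int threshold_ms_;
    std::unordered_map<std::string, std::deque<Event>> active_;
};
```

With an assumed 50 ms window and 100 ms threshold, feeding in the six sample log lines would flag three pairs: `cache_miss` + `read_user` (103 ms) and `read_user` + `external_call` (102 ms) for `req-1001`, and `cache_miss` + `external_call` (112 ms) for `req-3001`. Note that this sketch never removes the bucket of a finished request, so memory is not yet bounded.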
---
## What This Introduces
Now we must deal with:
- bounded memory
- streaming constraints
- time-based eviction
- correlation logic
And beyond that:
- out-of-order events
- duplicate logs
- partial data
- noise filtering
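As one small example from this list, duplicate logs could be filtered before correlation. The sketch below assumes that a duplicate is an exact repeat of the same request ID, event name, and timestamp; that definition, the `DuplicateFilter` name, and the `|` separator (which assumes the fields never contain that character) are all illustrative choices.

```cpp
#include <string>
#include <unordered_set>

// Drops exact repeats, identified by (request_id, event name, timestamp).
class DuplicateFilter {
public:
    // Returns true the first time a combination is seen, false for repeats.
    bool firstSeen(const std::string& request_id,
                   const std::string& event_name,
                   long long time_ms) {
        const std::string key =
            request_id + '|' + event_name + '|' + std::to_string(time_ms);
        return seen_.insert(key).second;
    }

private:
    // Grows without bound in this sketch; a real system would evict old keys
    // together with the time window.
    std::unordered_set<std::string> seen_;
};
```

Even here the hard part is the definition: whether two lines with the same content but timestamps one millisecond apart count as duplicates is a policy decision, not an algorithmic one.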
---
## The Real Insight
The difficulty is not in computing a sum.
The difficulty is in defining:
- what data is valid
- what events belong together
- what “close enough” means
- how the system behaves under imperfect conditions
---
## Key Takeaway
Two Sum is often presented as a problem about numbers.
In reality, it is a problem about assumptions.
Remove those assumptions, and the problem changes completely.
> The challenge is not finding two values.
> The challenge is understanding whether those values should ever be compared.
---
## Project Perspective
Exists in real engineering?
→ Yes, but as event correlation under constraints
Exists in interview form?
→ Yes, but stripped of context and complexity