Closes #01
@@ -1,173 +0,0 @@
# Analysis #XX — Two Sum Is Not About Numbers

## Problem (LeetCode-style)

You are given a list of records. Each record contains:

- an identifier
- a value
- optional metadata

Your task is to find whether there exists a pair of records whose values sum to a given target.

Return the identifiers of any such pair.

Constraints:
- Each record may be used at most once
- At most one valid answer exists

---

## Typical Interview Thinking

1. Start with brute force:
   - Check all pairs → O(n²)

2. Optimize:
   - Use a hash map
   - Store seen values
   - Lookup complement (target - value)

```cpp
unordered_map<int, int> seen;

for (int i = 0; i < n; ++i) {
    int complement = target - nums[i];

    if (seen.count(complement)) {
        return {seen[complement], i};
    }

    seen[nums[i]] = i;
}
```

Time complexity: O(n)
Space complexity: O(n)
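
The problem statement asks for record identifiers rather than array indices, so the fragment above needs a small adaptation. A minimal sketch, assuming a hypothetical `Record` type with `id` and `value` fields and hypothetical `IdPair` and `findPairIds` names (none of them come from the original text):

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical record type; the field names are assumptions for illustration.
struct Record {
    std::string id;
    int value;
};

// Result of the search; `found` stays false when no pair sums to the target.
struct IdPair {
    bool found = false;
    std::string first_id;
    std::string second_id;
};

// Single pass with a hash map from a value to the index of a record seen earlier.
IdPair findPairIds(const std::vector<Record>& records, int target)
{
    std::unordered_map<long long, std::size_t> seen;

    for (std::size_t i = 0; i < records.size(); ++i) {
        // Widen before subtracting so extreme inputs cannot overflow int.
        const long long complement =
            static_cast<long long>(target) - records[i].value;

        const auto it = seen.find(complement);
        if (it != seen.end()) {
            return IdPair{true, records[it->second].id, records[i].id};
        }

        // Insert only after the lookup, so a record is never paired with itself.
        seen.emplace(records[i].value, i);
    }

    return IdPair{};
}
```

Doing the lookup before the insert is what enforces the "each record may be used at most once" constraint: a single record with value 5 does not match a target of 10.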

---

## What This Actually Tests

- Pattern recognition
- Familiarity with hash maps
- Knowledge of time complexity
- Prior exposure to the problem

---

## Real-World Version (Logs & Event Correlation)

### Synthetic Log Example

```
2026-04-16T10:15:01.123Z service=api event=parse_input latency=12ms request_id=req-1001
2026-04-16T10:15:01.130Z service=cache event=cache_miss latency=48ms request_id=req-1001
2026-04-16T10:15:01.135Z service=db event=read_user latency=55ms request_id=req-1001
2026-04-16T10:15:01.144Z service=net event=external_call latency=47ms request_id=req-1001
2026-04-16T10:15:01.151Z service=cache event=cache_miss latency=60ms request_id=req-3001
2026-04-16T10:15:01.154Z service=net event=external_call latency=52ms request_id=req-3001
```

---

## Real Problem

Detect whether there exist two events:

- belonging to the same request_id
- occurring within a time window
- whose combined latency exceeds a threshold
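
The three conditions above can be written as a single predicate over two events. This is only a sketch: the trimmed-down `Event` struct and the name `isCorrelatedPair` are assumptions for illustration, and the text does not say whether the window and threshold bounds are inclusive, so the choices below (inclusive window, strict threshold) are assumptions too.

```cpp
#include <cstdint>
#include <string>

// Trimmed-down event with only the fields the check needs (names are assumptions).
struct Event {
    std::int64_t timestamp_ms;
    int latency_ms;
    std::string request_id;
};

// True when both events belong to the same request, lie at most window_ms
// apart, and their combined latency is strictly above threshold_ms.
bool isCorrelatedPair(const Event& a, const Event& b,
                      std::int64_t window_ms, int threshold_ms)
{
    if (a.request_id != b.request_id) {
        return false;  // unrelated requests are never comparable
    }

    // Absolute time distance, so the argument order does not matter.
    const std::int64_t delta = a.timestamp_ms >= b.timestamp_ms
                                   ? a.timestamp_ms - b.timestamp_ms
                                   : b.timestamp_ms - a.timestamp_ms;
    if (delta > window_ms) {
        return false;  // too far apart in time
    }

    // Widen before adding so large latencies cannot overflow int.
    return static_cast<std::int64_t>(a.latency_ms) + b.latency_ms > threshold_ms;
}
```

With an assumed 100 ms threshold and 20 ms window, the `cache_miss` (48ms) and `read_user` (55ms) events of `req-1001` in the log above qualify. The `read_user` event of `req-1001` and the `cache_miss` event of `req-3001` do not, because they belong to different requests.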

---

## Where LeetCode Logic Breaks

### 1. Not Exact Match
LeetCode:
```
a + b == target
```

Reality:
```
a + b > threshold
```

---

### 2. Context Matters (request_id)

You cannot mix unrelated events.

---

### 3. Time Window

Events must be close in time.

---

### 4. Streaming Data

- Data arrives continuously
- May be out of order
- Cannot store everything

---

## Real Engineering Approach

### Core Idea

Maintain sliding windows per request_id.

### Pseudocode

```
for each incoming event:
    bucket = active_events[event.request_id]

    remove old events outside time window

    for each old_event in bucket:
        if event.latency + old_event.latency > threshold:
            report anomaly

    add event to bucket
```

---

## Additional Real Constraints

- Out-of-order events
- Missing logs
- Duplicate events
- Noise filtering
- Memory limits
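
Duplicate events and memory limits pull in opposite directions: remembering every line ever seen removes duplicates but grows without bound. One possible compromise, sketched below, remembers only the most recent lines. `DuplicateFilter`, `accept`, and the capacity parameter are hypothetical names, and keying on the raw log line is an assumption (a real system might key on a subset of fields instead).

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_set>

// Remembers at most `capacity` recent lines and rejects repeats among them.
class DuplicateFilter {
public:
    explicit DuplicateFilter(std::size_t capacity) : capacity_(capacity) {}

    // Returns true if the line is new (within the remembered history),
    // false if it repeats a line that is still remembered.
    bool accept(const std::string& raw_line)
    {
        if (!seen_.insert(raw_line).second) {
            return false;  // already remembered: treat as duplicate
        }

        order_.push_back(raw_line);

        // Forget the oldest line once the history exceeds the capacity.
        if (order_.size() > capacity_) {
            seen_.erase(order_.front());
            order_.pop_front();
        }
        return true;
    }

private:
    std::size_t capacity_;
    std::deque<std::string> order_;         // insertion order, oldest first
    std::unordered_set<std::string> seen_;  // fast membership test
};
```

The trade-off is explicit: a duplicate that arrives after its original has been forgotten passes through again, so the capacity has to be chosen against the expected delay between repeats.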

---

## Key Takeaway

Two Sum is not about numbers.

It is about recognizing patterns in controlled environments.

Real engineering problems are about:

- defining valid data
- handling imperfect inputs
- managing time and memory
- maintaining system behavior under constraints

---

## Project Perspective

Exists in real engineering?
→ Yes, but heavily transformed

Exists in interview form?
→ Yes, but oversimplified
@@ -0,0 +1,223 @@
#include <algorithm>
#include <cstdint>
#include <deque>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

struct Event {
    std::int64_t timestamp_ms;
    std::string service;
    std::string event;
    int latency_ms;
    std::string request_id;
    std::string raw_line;
};

struct PairResult {
    bool found = false;
    Event first;
    Event second;
};

std::int64_t parseTimestampMs(const std::string& timestamp)
{
    // Expected format:
    // YYYY-MM-DDTHH:MM:SS.mmmZ
    // For this demo we only convert the HH:MM:SS.mmm part to milliseconds.
    const std::size_t t_pos = timestamp.find('T');
    const std::size_t z_pos = timestamp.find('Z');

    if (t_pos == std::string::npos || z_pos == std::string::npos) {
        throw std::runtime_error("Invalid timestamp: " + timestamp);
    }

    const std::string time_part = timestamp.substr(t_pos + 1, z_pos - t_pos - 1);

    int hours = 0;
    int minutes = 0;
    int seconds = 0;
    int millis = 0;
    char colon1 = '\0';
    char colon2 = '\0';
    char dot = '\0';

    std::istringstream iss(time_part);
    iss >> hours >> colon1 >> minutes >> colon2 >> seconds >> dot >> millis;

    if (!iss || colon1 != ':' || colon2 != ':' || dot != '.') {
        throw std::runtime_error("Invalid time part: " + time_part);
    }

    return (((hours * 60LL) + minutes) * 60LL + seconds) * 1000LL + millis;
}

Event parseLogLine(const std::string& line)
{
    std::istringstream iss(line);

    std::string timestamp;
    std::string service_token;
    std::string event_token;
    std::string latency_token;
    std::string request_token;

    if (!(iss >> timestamp >> service_token >> event_token >> latency_token >> request_token)) {
        throw std::runtime_error("Cannot parse log line: " + line);
    }

    auto valueAfterEquals = [](const std::string& token) -> std::string {
        const std::size_t pos = token.find('=');
        if (pos == std::string::npos || pos + 1 >= token.size()) {
            throw std::runtime_error("Invalid token: " + token);
        }
        return token.substr(pos + 1);
    };

    Event result;
    result.timestamp_ms = parseTimestampMs(timestamp);
    result.service = valueAfterEquals(service_token);
    result.event = valueAfterEquals(event_token);

    std::string latency_value = valueAfterEquals(latency_token);
    if (latency_value.size() < 3 || latency_value.substr(latency_value.size() - 2) != "ms") {
        throw std::runtime_error("Invalid latency token: " + latency_token);
    }
    latency_value.erase(latency_value.size() - 2);
    result.latency_ms = std::stoi(latency_value);

    result.request_id = valueAfterEquals(request_token);
    result.raw_line = line;

    return result;
}

std::vector<Event> parseLogs(const std::vector<std::string>& lines)
{
    std::vector<Event> events;
    events.reserve(lines.size());

    for (const std::string& line : lines) {
        events.push_back(parseLogLine(line));
    }

    return events;
}

void printPair(const PairResult& result, const std::string& label)
{
    std::cout << label << '\n';

    if (!result.found) {
        std::cout << " no pair found\n\n";
        return;
    }

    std::cout << " first : " << result.first.raw_line << '\n';
    std::cout << " second: " << result.second.raw_line << '\n';
    std::cout << " combined latency: "
              << (result.first.latency_ms + result.second.latency_ms)
              << "ms\n\n";
}

PairResult interviewStyleReduction(const std::vector<Event>& events, int threshold_ms)
{
    // Intentionally wrong for the real-world problem:
    // it ignores request_id and time.
    for (std::size_t i = 0; i < events.size(); ++i) {
        for (std::size_t j = i + 1; j < events.size(); ++j) {
            if (events[i].latency_ms + events[j].latency_ms > threshold_ms) {
                return PairResult{true, events[i], events[j]};
            }
        }
    }

    return PairResult{};
}

PairResult realSlidingWindowDetection(const std::vector<Event>& events,
                                      int threshold_ms,
                                      std::int64_t window_ms)
{
    std::unordered_map<std::string, std::deque<Event> > active_events;

    for (const Event& current : events) {
        std::deque<Event>& bucket = active_events[current.request_id];

        while (!bucket.empty() &&
               (current.timestamp_ms - bucket.front().timestamp_ms) > window_ms) {
            bucket.pop_front();
        }

        for (const Event& previous : bucket) {
            const std::int64_t delta = current.timestamp_ms - previous.timestamp_ms;

            if (delta >= 0 && delta <= window_ms &&
                previous.latency_ms + current.latency_ms > threshold_ms) {
                return PairResult{true, previous, current};
            }
        }

        bucket.push_back(current);
    }

    return PairResult{};
}

void printEvents(const std::vector<Event>& events)
{
    std::cout << "Synthetic log stream:\n";
    for (const Event& event : events) {
        std::cout << " " << event.raw_line << '\n';
    }
    std::cout << '\n';
}

int main()
{
    try {
        const std::vector<std::string> raw_logs = {
            "2026-04-16T10:15:01.100Z service=api event=parse_input latency=12ms request_id=req-1001",
            "2026-04-16T10:15:01.110Z service=cache event=cache_miss latency=48ms request_id=req-1001",
            "2026-04-16T10:15:01.120Z service=auth event=token_check latency=58ms request_id=req-2001",
            "2026-04-16T10:15:01.130Z service=db event=read_user latency=43ms request_id=req-3001",
            "2026-04-16T10:15:01.135Z service=db event=read_user latency=55ms request_id=req-1001",
            "2026-04-16T10:15:01.144Z service=net event=external_call latency=47ms request_id=req-1001",
            "2026-04-16T10:15:01.200Z service=cache event=cache_miss latency=60ms request_id=req-3001",
            "2026-04-16T10:15:01.260Z service=net event=external_call latency=52ms request_id=req-3001"
        };

        const int threshold_ms = 100;
        const std::int64_t window_ms = 20;

        const std::vector<Event> events = parseLogs(raw_logs);

        printEvents(events);

        std::cout << "Threshold: " << threshold_ms << "ms\n";
        std::cout << "Time window: " << window_ms << "ms\n\n";

        const PairResult naive_result = interviewStyleReduction(events, threshold_ms);
        printPair(naive_result, "Interview-style reduction (ignores request_id and time):");

        const PairResult real_result = realSlidingWindowDetection(events, threshold_ms, window_ms);
        printPair(real_result, "Streaming sliding-window detection:");

        std::cout << "Notes:\n";
        std::cout << " - The interview-style version can produce a false correlation.\n";
        std::cout << " - In this dataset, it first matches 48ms from req-1001 with 58ms from req-2001.\n";
        std::cout << " - That pair exceeds the threshold, but it is operationally meaningless.\n";
        std::cout << " - The streaming version only correlates events from the same request_id\n";
        std::cout << " and only within the configured time window.\n";

        return 0;
    }
    catch (const std::exception& ex) {
        std::cerr << "Error: " << ex.what() << '\n';
        return 1;
    }
}
@@ -1,4 +1,6 @@
# Analysis #XX — Two Sum Is Not About Numbers
# Analysis #03 — Two Sum Is Not About Numbers

---

## Problem

@@ -9,9 +11,10 @@ At first glance, the problem looks trivial:
This is one of the most well-known interview questions, commonly referred to as **Two Sum**.

It is simple, clean, and perfectly defined:
- a static array
- exact arithmetic
- a guaranteed answer

- a static array
- exact arithmetic
- a guaranteed answer

And that’s exactly why it works so well in interviews.

@@ -21,10 +24,10 @@ And that’s exactly why it works so well in interviews.

A candidate is expected to go through a familiar progression:

1. Start with brute force (O(n²))
2. Recognize inefficiency
3. Optimize using a hash map
4. Achieve O(n) time complexity
1. Start with brute force (O(n²))
2. Recognize inefficiency
3. Optimize using a hash map
4. Achieve O(n) time complexity

```cpp
unordered_map<int, int> seen;
@@ -40,7 +43,7 @@ for (int i = 0; i < n; ++i) {
}
```

The “correct” answer is not about solving the problem.
The “correct” answer is not really about solving the problem.

It is about recognizing the pattern.

@@ -50,35 +53,41 @@ It is about recognizing the pattern.

Despite its simplicity, this problem evaluates:

- familiarity with standard patterns
- ability to choose a data structure
- understanding of time complexity
- familiarity with standard patterns
- ability to choose a data structure
- understanding of time complexity

But most importantly:

> it tests whether you have seen this problem before.

A candidate who has already practiced this family of tasks will likely reach the expected answer quickly.

A candidate who has spent years solving real engineering problems may still pause — not because the problem is hard, but because the interview expects a very specific kind of answer.

---

## A Subtle Shift

Now let’s take the same idea and move it one step closer to reality.
Now let’s move the same idea one step closer to reality.

Instead of numbers, we have **log events**.

Instead of a static array, we have a **stream**.

Instead of a clean equality, we have **imperfect data and thresholds**.
Instead of a clean equality, we have **imperfect data, context, and thresholds**.

---

## Synthetic Log Example

```
```text
2026-04-16T10:15:01.123Z service=api event=parse_input latency=12ms request_id=req-1001
2026-04-16T10:15:01.130Z service=cache event=cache_miss latency=48ms request_id=req-1001
2026-04-16T10:15:01.135Z service=db event=read_user latency=55ms request_id=req-1001
2026-04-16T10:15:01.141Z service=auth event=token_check latency=18ms request_id=req-2001
2026-04-16T10:15:01.144Z service=net event=external_call latency=47ms request_id=req-1001
2026-04-16T10:15:01.149Z service=db event=read_user latency=22ms request_id=req-2001
2026-04-16T10:15:01.151Z service=cache event=cache_miss latency=60ms request_id=req-3001
2026-04-16T10:15:01.154Z service=net event=external_call latency=52ms request_id=req-3001
```
@@ -92,9 +101,9 @@ We are no longer asked to find two numbers.
Instead, the problem becomes:

> Detect whether there exist two events:
> - belonging to the same request
> - occurring close in time
> - whose combined latency exceeds a threshold
> - belonging to the same request
> - occurring within a time window
> - whose combined latency exceeds a threshold

This still *looks* like Two Sum.

@@ -102,60 +111,104 @@ But it is not.

---

## How LeetCode Thinking Tries to Adapt

The first instinct is to simplify.

Take the log stream, ignore most of the structure, extract just the latency values, and reduce everything back to “numbers in an array”.

That leads to a familiar line of thinking:

1. Collect latencies
2. Search for matching pairs
3. Try to reuse the same hash map pattern
4. Treat the task as another variation of Two Sum

This is exactly what interview training encourages:

> reduce the problem until it matches a known template.

That works beautifully in interviews.

But this is also where the model starts to break.

---

## Where the Interview Model Breaks

### 1. No Exact Match
### 1. It Is Not an Exact-Match Problem

Interview version:
```

a + b == target
```

Real version:
```
a + b > threshold
```

We are not searching for a perfect complement.
a + b > threshold

We are not searching for a perfect complement.
We are evaluating a condition.
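
One consequence of moving from `==` to `>`: a hash lookup of the exact complement no longer applies, but for a pure existence check it is enough to compare each new event with the largest latency still inside the window. The sketch below uses a monotonic deque (the sliding-window maximum technique), which is a different technique from the per-bucket scan in the demo file. `ThresholdPairDetector` and `observe` are hypothetical names, and events are assumed to arrive in timestamp order.

```cpp
#include <cstdint>
#include <deque>
#include <string>
#include <unordered_map>
#include <utility>

class ThresholdPairDetector {
private:
    using Entry = std::pair<std::int64_t, int>;  // (timestamp_ms, latency_ms)

public:
    ThresholdPairDetector(int threshold_ms, std::int64_t window_ms)
        : threshold_ms_(threshold_ms), window_ms_(window_ms) {}

    // Returns true if the new event, combined with some earlier event of the
    // same request that is at most window_ms older, exceeds the threshold.
    bool observe(const std::string& request_id,
                 std::int64_t timestamp_ms, int latency_ms)
    {
        // Latencies are strictly decreasing from front to back,
        // so the front always holds the maximum inside the window.
        std::deque<Entry>& window = windows_[request_id];

        // Drop entries that have fallen out of the time window.
        while (!window.empty() &&
               timestamp_ms - window.front().first > window_ms_) {
            window.pop_front();
        }

        const bool hit =
            !window.empty() &&
            static_cast<std::int64_t>(window.front().second) + latency_ms >
                threshold_ms_;

        // Older entries that are not larger than the newcomer can never be
        // the maximum again, because the newcomer also expires later.
        while (!window.empty() && window.back().second <= latency_ms) {
            window.pop_back();
        }
        window.emplace_back(timestamp_ms, latency_ms);

        return hit;
    }

private:
    int threshold_ms_;
    std::int64_t window_ms_;
    std::unordered_map<std::string, std::deque<Entry>> windows_;
};
```

Each event is pushed and popped at most once, so the amortized cost per event is constant. The price is that only the existence of a pair is reported; listing every offending pair still requires scanning the bucket.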

---

### 2. Context Is Mandatory
### 2. Context Cannot Be Ignored

You cannot combine arbitrary events.
A latency of 55ms from one request and 52ms from another may exceed the threshold.

A latency spike only makes sense **within the same request**.
But together they mean nothing.

Without context, the result is meaningless.
Without context, the result is technically correct — and completely useless.

---

### 3. Time Matters
### 3. Time Makes the Problem Harder

Events are not just values — they exist in time.

Two events five seconds apart may not be related at all.
Two events may belong to the same request and still be unrelated if they are too far apart.

This introduces:
- time windows
- ordering issues
- temporal constraints

- time windows
- ordering
- eviction

---

### 4. Data Is Not Static
### 4. The Data Is Not Static

LeetCode assumes:
- full dataset
- already loaded
- perfectly ordered
Interview assumptions:

- full dataset available
- stable ordering
- perfect input

Reality:
- streaming input
- delayed events
- missing entries
- out-of-order delivery

- streaming data
- out-of-order events
- missing logs
- duplicates

The “single clean pass over an array” stops being a valid model.
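
One common way to cope with out-of-order delivery is to hold events briefly and release them in timestamp order once they are older than a tolerated lateness. A minimal sketch of that idea follows; `ReorderBuffer`, `TimedEvent`, and the `allowed_lateness_ms` parameter are hypothetical names, and dropping events that arrive too late is just one possible policy.

```cpp
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical minimal event; only the timestamp matters for reordering.
struct TimedEvent {
    std::int64_t timestamp_ms;
    int latency_ms;
};

class ReorderBuffer {
public:
    explicit ReorderBuffer(std::int64_t allowed_lateness_ms)
        : allowed_lateness_ms_(allowed_lateness_ms) {}

    // Accepts one event in arrival order and returns the events that are now
    // safe to process, sorted by timestamp.
    std::vector<TimedEvent> push(const TimedEvent& event)
    {
        std::vector<TimedEvent> released;

        // An event older than the watermark arrived too late: count and drop it.
        if (has_seen_ && event.timestamp_ms < watermark()) {
            ++dropped_;
            return released;
        }

        pending_.push(event);
        if (!has_seen_ || event.timestamp_ms > max_seen_ms_) {
            max_seen_ms_ = event.timestamp_ms;
            has_seen_ = true;
        }

        // Everything at or below the watermark can no longer be overtaken.
        while (!pending_.empty() &&
               pending_.top().timestamp_ms <= watermark()) {
            released.push_back(pending_.top());
            pending_.pop();
        }
        return released;
    }

    std::size_t dropped() const { return dropped_; }

private:
    // Orders the heap so that the earliest timestamp is on top.
    struct EarliestOnTop {
        bool operator()(const TimedEvent& a, const TimedEvent& b) const {
            return a.timestamp_ms > b.timestamp_ms;
        }
    };

    std::int64_t watermark() const {
        return max_seen_ms_ - allowed_lateness_ms_;
    }

    std::int64_t allowed_lateness_ms_;
    std::int64_t max_seen_ms_ = 0;
    bool has_seen_ = false;
    std::size_t dropped_ = 0;
    std::priority_queue<TimedEvent, std::vector<TimedEvent>, EarliestOnTop> pending_;
};
```

The lateness bound trades latency for completeness: a larger value tolerates more disorder but delays every result by that amount and keeps more events in memory.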

---

### 5. Pattern Matching Becomes a Trap

The more familiar the pattern, the stronger the temptation:

> “This is just Two Sum.”

But in reality:

- request_id defines grouping
- timestamp defines relevance
- streaming defines constraints

These are not details.

They are the problem.

---

@@ -169,21 +222,22 @@ It becomes:

> “determine which events are comparable at all”

And that is a fundamentally different problem.
The arithmetic is trivial.

The system is not.

---

## Real Engineering Approach

Instead of solving a mathematical puzzle, we build a system.
Instead of solving a puzzle, we build a mechanism.

### Core Idea

Maintain a sliding window of recent events per request.
Maintain a sliding window per request_id.

### Pseudocode

```
for each incoming event:
    bucket = active_events[event.request_id]

@@ -194,7 +248,6 @@ for each incoming event:
            report anomaly

    add event to bucket
```

---

@@ -202,17 +255,52 @@ for each incoming event:

Now we must deal with:

- bounded memory
- streaming constraints
- time-based eviction
- correlation logic
- bounded memory
- streaming constraints
- time-based eviction
- request-level grouping

And beyond that:
And then reality hits:

- out-of-order events
- duplicate logs
- partial data
- noise filtering
- out-of-order events
- duplicate logs
- partial data
- noise

At this point, the original Two Sum is almost unrecognizable.
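
Bounded memory deserves one concrete illustration. Trimming old events inside a bucket is not enough if the map of buckets itself keeps one entry per request forever, so idle requests need to be removed as well. A hedged sketch, assuming buckets that store only timestamps in arrival order (the demo stores full events) and a hypothetical `evictIdleRequests` helper with an `idle_ms` cutoff:

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <unordered_map>

// Removes every request whose bucket is empty or whose newest timestamp is
// more than idle_ms older than now_ms. Returns the number of removed requests.
std::size_t evictIdleRequests(
    std::unordered_map<std::string, std::deque<std::int64_t>>& active,
    std::int64_t now_ms,
    std::int64_t idle_ms)
{
    std::size_t removed = 0;

    for (auto it = active.begin(); it != active.end();) {
        const std::deque<std::int64_t>& bucket = it->second;
        const bool idle = bucket.empty() || now_ms - bucket.back() > idle_ms;

        if (idle) {
            it = active.erase(it);  // erase returns the next valid iterator
            ++removed;
        } else {
            ++it;
        }
    }

    return removed;
}
```

A full scan like this is linear in the number of tracked requests, so it would normally run periodically rather than on every event.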

---

## Demo

See example implementation:

- examples/two_sum_logs_demo.cpp

---

## Example Output

Interview-style reduction:
combines events from different request_id → false positive

Streaming solution:
finds valid pair within same request and time window

---

## Explanation

The interview-style solution produces a mathematically valid result.

But it mixes unrelated events.

The streaming solution respects:

- request boundaries
- time constraints

Which makes the result meaningful.

---

@@ -222,20 +310,20 @@ The difficulty is not in computing a sum.

The difficulty is in defining:

- what data is valid
- what events belong together
- what “close enough” means
- how the system behaves under imperfect conditions
- what data is valid
- what events belong together
- what “close enough” means
- how the system behaves under imperfect conditions

---

## Key Takeaway

Two Sum is often presented as a problem about numbers.
Two Sum is not about numbers.

In reality, it is a problem about assumptions.
It is about assumptions.

Remove those assumptions, and the problem changes completely.
Remove those assumptions — and the problem changes completely.

> The challenge is not finding two values.
> The challenge is understanding whether those values should ever be compared.
@@ -245,7 +333,14 @@ Remove those assumptions, and the problem changes completely.
## Project Perspective

Exists in real engineering?
→ Yes, but as event correlation under constraints
→ Yes, but as event correlation under constraints

Exists in interview form?
→ Yes, but stripped of context and complexity
→ Yes, but stripped of context and complexity

---

## Final Note

The algorithm was never the hard part.
The assumptions were.
