# Google Interview

## Coding Interview 1

Input: `[['a1', 'a2', 'a1b'], ['b1', 'b2'], ['c1', 'c2']]`

### Problem 1

Print all combinations, one string from each group, like:
```
a1b1c1
a1b1c2
a1b2c1
a1b2c2
a2b1c1
a2b1c2
a2b2c1
a2b2c2
a1bb1c1
a1bb1c2
a1bb2c1
a1bb2c2
```
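
A minimal sketch of one way to produce this output (Python and the function name are my choice, not part of the interview):

```python
def print_combinations(groups, prefix=''):
    # Pick one string from the first group, then recurse on the remaining groups.
    if not groups:
        print(prefix)
        return
    for s in groups[0]:
        print_combinations(groups[1:], prefix + s)

print_combinations([['a1', 'a2', 'a1b'], ['b1', 'b2'], ['c1', 'c2']])
```
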
### Problem 2

Given a string such as 'a2b2c1', determine whether it is one of the combinations above.

Solved with dynamic programming.
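
The notes only say "dynamic programming", so this is just one plausible formulation (names are mine): keep the set of positions in the string reachable after matching one string from each group, in order.

```python
def is_combination(s, groups):
    reachable = {0}                      # positions in s reachable so far
    for group in groups:
        nxt = set()
        for pos in reachable:
            for piece in group:
                if s.startswith(piece, pos):
                    nxt.add(pos + len(piece))
        reachable = nxt
    return len(s) in reachable

groups = [['a1', 'a2', 'a1b'], ['b1', 'b2'], ['c1', 'c2']]
print(is_combination('a2b2c1', groups))   # True
print(is_combination('a1bb2c1', groups))  # True: 'a1b' + 'b2' + 'c1'
print(is_combination('a1b2c3', groups))   # False
```

The per-group set of reachable positions is the DP state; it is needed because one group element can be a prefix of another ('a1' vs 'a1b'), so a greedy left-to-right match is not enough.
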
## Coding Interview 2

### Problem 1

Given a tree (not a binary one):

```
        o
       /|\
      / | \
     o  o  o
    / \   / \
   o   o o   o
   |     |
   o     o
```

Find nodes with similar structure; for example, all leaf nodes are similar to each other.

My solution was to come up with a signature for each node, like

`sig(node) = <num children>,(<sig child 1>),(<sig child 2>),(<sig child 3>), ...`

then create a map from signature -> set of similar nodes.

If the nodes are diverse (the signatures get long), we could actually hash the signature instead of storing the full string.
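
A sketch of that signature idea (the Node class and helper names are mine; children are treated as ordered, matching the signature format above):

```python
from collections import defaultdict

class Node:
    def __init__(self, children=None):
        self.children = children or []

def group_by_structure(root):
    """Map each structural signature to the set of nodes that share it."""
    groups = defaultdict(set)

    def sig(node):
        child_sigs = [sig(c) for c in node.children]
        # "<num children>,(<sig child 1>),(<sig child 2>),..."
        s = ','.join([str(len(child_sigs))] + ['(%s)' % cs for cs in child_sigs])
        groups[s].add(node)
        return s

    sig(root)
    return groups

# All leaves get the signature "0" and end up in the same group.
root = Node([Node([Node(), Node()]), Node(), Node([Node(), Node()])])
print(sorted(len(nodes) for nodes in group_by_structure(root).values()))  # [1, 2, 5]
```

Sorting `child_sigs` before joining would make the grouping insensitive to child order, if that is what "similar" should mean; hashing the signature, as noted above, keeps the map keys short when the tree is large.
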
### Problem 2

Represent a number in base (-2), i.e. as a*(-2)^4 + b*(-2)^3 + c*(-2)^2 + d*(-2)^1 + e*(-2)^0 with binary digits.

The place values are 1, -2, 4, -8, 16, -32, ...

 0 = 0000
 1 = 0001
-1 = 0011
 2 = 0110
-2 = 0010
 3 = 0111
-3 = 1101
...
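
A conversion sketch for checking the table: repeatedly divide by -2 and force each remainder into {0, 1} (the function name is mine).

```python
def to_base_minus2(n):
    """Digits of n in base (-2), most significant first."""
    if n == 0:
        return '0'
    digits = []
    while n != 0:
        n, r = divmod(n, -2)
        if r < 0:        # Python's remainder takes the divisor's sign; push it into {0, 1}
            r += 2
            n += 1
        digits.append(str(r))
    return ''.join(reversed(digits))

for k in [0, 1, -1, 2, -2, 3, -3]:
    print(k, '=', to_base_minus2(k))
```
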
## System Design 1

800 billion query log entries, 8 machines.

Find the top million queries by frequency.

How many unique queries? Take samples to estimate the number of unique queries; assume it shrinks by a factor of ten.

Distribute the queries evenly across the machines.

Go over the logs on each machine and create 10 B key-value pairs (assuming a 10x reduction from raw entries to unique queries).

For all machines that is about 4 TB, with an average query length of 50 bytes.

Data per machine = 500 GB of key-value data (produced from 5 TB of raw logs).
(Use a hash map; split the data into chunks of ~1 TB or 0.5 TB. Each chunk produces about 50 GB of pairs; do this 5-10 times to cover all the data.)

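The size estimates above, spelled out under the stated assumptions (800 B entries, 8 machines, 50-byte queries, 10x reduction from raw entries to unique pairs):

```python
entries = 800e9          # total log entries
machines = 8
query_bytes = 50         # average query length in bytes
reduction = 10           # assumed raw-to-unique ratio

per_machine_entries = entries / machines                # 100e9 entries
raw_per_machine = per_machine_entries * query_bytes     # 5e12 bytes  ~ 5 TB of raw logs
pairs_per_machine = per_machine_entries / reduction     # 10e9 key-value pairs
kv_per_machine = pairs_per_machine * query_bytes        # 5e11 bytes  ~ 500 GB
kv_total = kv_per_machine * machines                    # 4e12 bytes  ~ 4 TB
```
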
Reading from disk would take ~20 h at 200 MB/s, but with 6 disks in parallel it is ~3.5 h to read everything.

Another way to do this is to sort each chunk (by query) and merge them, producing the full 500 GB of data.

Idea: merge the data between machines in pairs (~1.5 h to send over the wire).

Somehow sort the 500 GB of data on each machine and produce the top 1 M queries per machine (~50 MB of data).
(That is about 10 chunks of 50 GB; create another hash from count -> set of queries.)

Do chunks have overlapping queries? Yes, they will.

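A sketch of this per-machine step (assuming each chunk has already been aggregated into a query -> count map; `heapq.nlargest` keeps the top million):

```python
import heapq
from collections import Counter

def machine_top(chunk_counts, k=1_000_000):
    """chunk_counts: an iterable of Counter objects, one per ~50 GB chunk.
    Chunks can repeat queries, so sum them before taking the top k."""
    total = Counter()
    for counts in chunk_counts:
        total.update(counts)
    return heapq.nlargest(k, total.items(), key=lambda kv: kv[1])
```
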
Send all the data to one machine (~400 MB) and apply some sort of heap-based merge of the presorted arrays.

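And the final step on the single machine, as a sketch: a lazy k-way merge of the 8 presorted (count, query) lists (this assumes each query was routed to exactly one machine, so the counts are already final).

```python
import heapq
from itertools import islice

def global_top(per_machine_tops, k=1_000_000):
    """per_machine_tops: one list of (count, query) pairs per machine,
    each already sorted in descending order of count."""
    merged = heapq.merge(*per_machine_tops, reverse=True)   # lazy k-way merge
    return list(islice(merged, k))

# global_top([[(9, 'foo'), (4, 'bar')], [(7, 'baz'), (1, 'qux')]], k=3)
# -> [(9, 'foo'), (7, 'baz'), (4, 'bar')]
```
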
Correctness of data.

How big are the machines?

- 32 cores at 3 GHz
- 120 GB of RAM
- 6 x 2 TB spinning disks
- 10 Gbps network connection

So you can store the 5 TB of raw logs on each machine.

How to build the key-value pairs? Use a hash map.

Run time:

2 M queries is about 100 MB of data per machine (~1 GB transferred across all machines); maybe another 1 GB for the second phase.

Roughly 4-5 h to process all of this data.