One of my favorite ‘toy problems,’ often asked in technical interviews during a ‘screening’ phase, is to implement an asynchronous mapping function. This problem can be tricky because it tests your knowledge of asynchronicity and your ability to use callback functions. This knowledge is considered necessary for a web-application engineer — asynchronous programming techniques, event systems, and use of callback functions are standard for building web apps. So, many interviewers use this question to cut to the chase.
The prompt will usually go like this:
Write an asyncMap function with two parameters — an array of asynchronous functions and a callback function. Each of the asynchronous functions takes a its own callback and invokes that callback when finished executing. The callback passed to asyncMap should be invoked on the results of the callbacks of the asynchronous functions. The order of the results should be the same as the order in which the asynchronous functions were invoked. Note, the order in which the asynchronous functions are passed to asyncMap determines the order of the results, not the order in which the asynchronous functions finish executing. After all the callbacks of the asynchronous functions have returned, asyncMap should invoke the callback on the collection of results.
If your head isn’t spinning by now, you shouldn’t be reading this. Go build a compiler or something.
For those of us still here, let’s look at an example of the expected behavior and then dissect the prompt.
Example of Expected Behavior
Each of the asynchronous functions (job1
, job2
, and job3
) use a setTimeout
function to simulate a long-running task.
Based on the example, job2
will return first, job3
will return second, and job1
will return last, but the expectation is that asyncMap
will make it appear as though the results being passed into the final callback were returned in the order of job1
– job2
– job3
. The results will be passed into the callback, which logs the results in synchronous-appearing order, as seen on lines 17 – 19.
We have a clear expectation of the behavior of a single-case, so let’s dissect the prompt to see if we can pick out a more general expectation that handles all cases.
Dissecting the Prompt
- we are in JavaScript-land — the world of the browser, Node, the Event Loop, a single-thread, and web applications
Obviously, you can implement asynchronous code behavior in a language that can handle multi-threading, but the prompt will be framed very differently. If you are being asked this question in an interview, it is because you are going to be working with the JavaScript in some capacity if they offer you the job.
asyncMap
will take two inputs — an array and a callback function
The array argument will be full of asynchronous functions, and the callback function argument will be an action to perform when all of the functions in the array have finished executing.
Each member of the array can be thought of as a job that needs to be performed. The job will take time to complete, and because JavaScript runs on a single-thread, the job will block the event loop of the JavaScript interpreter if it is executed synchronously.
Instead, each job should be started and given instructions on what to do when the job is completed, and then the JavaScript interpreter can continue processing without needing to wait for each job to complete before even starting the next job. When each job finishes, it can use the instructions it was given to callback the main thread with the result of the job’s work. This is one way to implement asynchronicity — another popular style is to use Promises instead of callbacks.
Don’t get confused by all the callbacks floating around. Our asyncMap
function will take a single callback, while each of the functions within the input array will take its own callback, which will be different from the callback passed into asyncMap
as the second argument.
Once all of the jobs have completed and triggered their individual callbacks, asyncMap
should collect up all of the results from the jobs — the results should be ordered based on when the job was started, not when the job returned it’s value. Finally, the callback that was passed into asyncMap
should be invoked with the results as an argument.
asyncMap
should effectively simulate a synchronous mapping function, i.e., even though the collection being enumerated over contains functions that will be executed asynchronously, the result of our asyncMap
function will appear as though all of the jobs were executed synchronously.
Now that we have a target for which we can aim, let’s cover the concepts and techniques that will be involved in our algorithm, and then we will cover two possible solutions to the problem.
Concepts and Techniques Involved
The interview question is testing you on your ability to comprehend and use the following:
- asynchronous programming techniques
- callbacks
Before discussing two possible solutions, I want to explain how I see asynchronous programming and callback functions to be connected. If you feel confident that you could explain the asynchronous processing model and callback functions, feel free to skip to the solutions section.
The Asynchronous Processing Model and Callback Functions
Think about a fast-food drive-through. Imagine how stupid it would be to run a drive-through in the following way: one driver approaches the speaker and provides an order to the attendant through the speaker. Then, the driver waits in front of the speaker — effectively, blocking any other car from placing an order. Now, the fast-food restaurant hurries to prepare the order while everyone waits. Finally, a few minutes later once the order has been prepared, the attendant calls the driver forward to a window, where the driver receives the food, pays the attendant, and drives away. Only now can the next driver approach the speaker and place an order. The process repeats.
That would be a synchronous drive-through model. That is not how drive-through lines actually operate — drive-through lines are based on an asynchronous model of processing instructions.
It is actually highly intuitive for humans, minus all the vocabulary mumbo-jumbo. Let’s drive the analogy home by connecting the terminology to our intuitions. The drive-through is a single-instruction stream in that only one instruction can be executed at a time — the drive-through attendant can only take one order at a time and the drive-through line is setup for handling a single car at a time.
The drive-through can only handle one input at a time, and it takes time to produce the output after the input is given (they have to prepare the order). Similarly, computing I/O, input/output — reading and writing operations — usually take a long time, e.g., writing to a database or reading from a remote web server. If you have a single-instruction stream, as in a JavaScript runtime, your instruction stream will become blocked if you start an I/O operation and wait to perform more instructions until the I/O operation is completed.
The asynchronous programming model avoids the blocking by only starting the potentially blocking I/O operation and then continuing to process the next instruction instead of waiting for the I/O operation to complete. But, before continuing to process instructions, the potentially blocking I/O operation is started with some extra information. The operation must be sent off with instructions on how to callback to the single-instruction stream with the result of the operation.
Once the operation starts and the callback instructions are given to the operation, the single-instruction stream can move on to handle the next incoming instruction without worrying about the result of the previous operation. The previous operation is responsible for calling back the instruction stream with the result of the operation as soon as it’s completed.
To beat the dead horse, the drive-through attendant sends off a potentially blocking instruction to be handled by other staff members. The drive-through attendant has told the other staff members how to notify her when the order is ready for pickup. The drive-through attendant then receives another incoming order and sends it off for preparation. As soon as one of the orders is ready for pickup, the staff members will use the notification instructions to notify the attendant that the order is complete. The staff members can then pass the completed order to the attendant, who can use it to transact with the customer. There is very little time in that process where the instruction stream is blocked. Welcome to asynchronicity.
Congratulations, you are ready for the solution reveal…
Solutions
I lied. You aren’t ready. Or, maybe I’m not ready. I am going to offer two different possible solutions to asyncMap
, and they represent two different styles of programming that are often discussed in technical interviews. One of the solutions is the ‘traditional,’ imperative solution, and the other solution is the currently trending functional solution.
If you could already offer a succinct and opinionated response to an interviewer regarding the similiarities, differences, and tradeoffs associated with the imperative programming paradigm and functional programming paradigm, then feel free to skip to the actual solutions below. If not, here is some theory and opinion-hot-fire for you.
Imperative Programming and Functional Programming
Imperative programming is a paradigm or style of programming that uses imperative statements to change the program state — there is an emphasis on transparency forced by lower-level imperative statements.
Functional programming is a paradigm or style of programming that uses function evaluation to manage program state — there is an emphasis on immutable data and avoidance of changing the program state by using higher-level functional abstractions.
This is not a mutually exclusive categorization; also, there is no principled distinction that one can draw between the two paradigms. So, think of these as loose heuristics for talking about general patterns of programming. That being said, there are noticeable differences in the code!
So, why is the functional style trending? Why do people love it? Why do people hate it?
All for the same reason: abstraction.
The main difference between the imperative and functional styles of programming (heuristically, please don’t shoot me, Mr. Computer Scientist), is the level of abstraction.
The traditional imperative paradigm uses ‘lower-level’ and ‘transparent’ imperative statements to direct the program to compute in a specific way. The traditional functional paradigm uses ‘higher-level’ and ‘abstracted’ function evaluations to direct the program to compute in a specific way. We see this in our examples above — the imperative style uses statements to direct the program, while the functional style uses a function, forEach
to direct the program.
The functional style is prefered (often) because it allows the programmer to think like this:
- use
forEach
to give me each member of the collection, one by one - call the function I give you once for each member of the collection — pass the collection member into my function each time so that I can access it
The imperative style forces the programmer to keep track of a lot more, like this:
- use a
for
loop, starting at a variable calledi
i
should store the value0
at the beginning of thefor
loop- the
for
loop should continue as long asi
is less than the length property of thecollection
- every time through the
for
loop,i
should be incremented by1
- find the current member of the collection by indexing into the collection at the index
i
- perform some work with the current member of the collection that was just found via index
Notice, the functional style implementation uses a callback, which is passed into forEach
. That callback is invoked on each item from the collection, in turn. The abstraction ends up giving you an interface where you can say, ‘hey, forEach, give me each item of the collection one by one, and do whatever I tell you with each item, in turn.’ You don’t have to have any idea how forEach
is working, you just have to know its public interface.
The imperative style is very different. You know exactly how you are getting access to each item in the collection. You know, because you are directing the computer to perform each of the small pieces of work required to iterate through a collection. You know where you are starting the iteration (var i = 0
), how long to continue the iteration (as long as, i < collection.length
), and how big of a step each iteration is taking (i++
).
By encapsulating the iteration logic within the forEach
method, and abstracting away some of the differences between iterating through collections, the functional style allows a user to think about the business logic and not the logic of iteration. Also, the encapsulation and abstraction that forEach
provides offers protection from bugs that are often introduced with repeating any type of lower-level logic.
There are a lot of subtleties being trampled, but that should be enough to be getting along with. Let’s start with a ‘traditional’ imperative solution. I will explain the code directly after, before proceeding to the functional style of the solution.
Solution: imperative
- line 2: declare a
results
variable and initialize it to an empty array- this array will store the return values from all of our
jobs
— theresults
must be ordered based on the order of the originaljobs
array, not the order that thejobs
complete processing
- this array will store the return values from all of our
- line 3: declare a
finished
variable and initialize it to0
- this variable will keep track of how many of the
jobs
have finished —finished
gets incremented every time one of thejobs
calls back with the result of its work
- this variable will keep track of how many of the
- lines 5 – 14: iterate the length of the
jobs
array- this iteration will allow us to trigger all of the
jobs
one by one
- this iteration will allow us to trigger all of the
- lines 6 – 13: create an immediately-invoked function expression (IIFE)
- this is where it gets cool, but it also requires an aside …
ASIDE
An immediately-invoked function expression, or IIFE (pronounced ‘If – E’), is exactly what it sounds like — a function expression that is invoked immediately after it is defined. Read this code and try to understand what is happening before I explain it.
Seriously, an IIFE is just a function that has ()
immediately following the end of the function definition, which invokes the function that was just defined. The IIFE is a very common and useful pattern because it allows you to open a new scope and then immediately destroy the scope by invoking the function, with the possibility of storing the return value of the function without storing anything else in the scope. This is used often for simulating ‘private variables’ in JavaScript — the IIFE allows you to maintain a variable’s state by protecting it with a new scope, while still having access to the variables from the outer scope. We used an IIFE in our solution for that very reason.
END ASIDE
Ok, here is the imperative solution again, just to refresh the mind before we pick up again on line 7.
- line 7: within our IIFE wrapper, we are grabbing the current
job
withjobs[i]
i
is referring to the parameter in the function signature on line 6, which is actually being passed into the invocation of the IIFE from thefor
loop on line 13- we must do this to capture the current value of
i
because the iteration of thefor
loop will happen faster than each of thejobs
, so by the time the firstjob
is trying to store its result value at indexi
on line 9, the value ofi
will be the length of the array because the iteration will have completed all of its loops
- lines 8 – 12: invoke the current
job
and pass in a callback that will handle the result of thejob
- the result of the
job
is passed to the callback asval
on line 8- remember, each
job
must trigger its callback with the return value of its work — the return value of thejob
will be passed into the callback asval
on line 8
- remember, each
- line 9: assign the result
val
of thejob
to the correct position in theresults
collection- the position at which we are storing the
val
is the index number which is based on when thejob
was started
- the position at which we are storing the
- line 10: increment the
finished
count, indicating that anotherjob
has finished its work - line 11: if the number of
jobs
that havefinished
is equal to the length of thejobs
collection, then invoke theasyncMap
callback withresults
of all thejobs
- if the
finished
count is equal to the number ofjobs
in thejobs
collection, then we can assume that all of thejobs
have stored their result values in theresults
array — once all thejobs
have called back with their values, we know that theresults
array is complete and ready to pass back to theasyncMap
callback function
- if the
- the result of the
That is our complete imperative-style solution to asyncMap
. The trick to handling the asynchronous ordering is to create a new scope and capture the original start index number of the job
with an IIFE.
I love that solution, especially since it reminds me of beautiful third-party libraries like jQuery, but I also hate having to remember to wrap the work inside the for
loop within an IIFE. To avoid dealing with these ‘lower-level’ details of managing scope and race conditions, we can abstract away those lower-level details and wrap them up in a function so that we don’t have to remember the details each time. Instead, we can just remember the interface of our new function, and let it do all the work for us.
I am sure that you have inferred where I am going with this…
Solution: functional
Before I go through this solution line-by-line, I want to warn you that I have tweaked the logic on this implementation sligthly relative to the imperative-style solution. The differences in logic are the direction that finished
moves and the condition of the base-case. In the imperative version, finished
starts at 0
and is incremented each time a job
finishes, and the base-case is reached when finished
is equal to the length of the jobs
collection. In the functional version, this logic is reversed. finished
starts at the length of the jobs
collection, and each time a job
finishes, we decrement the finished
count. The base-case checks if the finished
count is 0
, indicating that all the jobs
have completed.
- line 2: declare a
results
variable and initialize it to an empty array- this array will store the return values from all of our
jobs
— theresults
must be ordered based on the order of the originaljobs
array, not the order that thejobs
complete processing
- this array will store the return values from all of our
- line 3: declare a
finished
variable and initialize it tojobs.length
- this variable will keep track of how many of the
jobs
have finished —finished
gets decremented every time one of thejobs
calls back with the result of its work
- this variable will keep track of how many of the
- lines 5 – 10: iterate over the
jobs
array withjobs.forEach
forEach
is the abstraction function mentioned above — it hides all of the iteration logic from us, so we only need to know the public interface offorEach
. It is a invoked on a collection.forEach
takes a callback parameter that is invoked once for every item in the collection.forEach
passes each item into the callback, along with the index number of that item. We can say, ‘hey, give me all the jobs one by one along with the index number that corresponds to their index in thejobs
collection, and do whatever I specify in the callback for eachjob
in the collection.’ We don’t need to worry about scopes and race conditions because ourforEach
abstraction worries about that for us!
- line 6: invoke the current
job
and provide a callback instruction to thejob
that specifies how to handle the result of thejob
- line 7: store the result of the
job
’s work atresults[i]
-within ourjob
callback, thejob
has finished its work and provided the result of the work to the callback in the form ofresult
, so we can store it in theresults
collection - line 8: if the
finished
count decremented by1
is equal to0
, callback toasyncMap
with the finalresults
of all thejobs
in the correct order- the decrementing happens first, before the equality check, so we can decrement and check all in one step
That is the complete solution to our functional-style solution to asyncMap
— terse and simple. All of the hard work of this prompt is handled for you if you use the native forEach
that is standard in ECMAScript 5+. If it feels like cheating to you, remind yourself how silly that attitude is — if utilizing a helpful abstraction is cheating (and you think cheating is bad), then you shouldn’t be using a computer or the English language 😉
Wow, that was a lot of material — let’s summarize.
Summary
Callbacks
A callback is just a group of instructions bundled into a function, which is passed into another function. The callback can get executed at a precise time of the developer’s choosing, usually after the function that is accepting the callback finishes its operations.
The Asynchronous Processing Model
This model processes instructions of a certain type (usually costly operations like I/O) asynchronously. The processor starts to process those instructions’ operations, and then hands those operations an instruction of how to return the result when complete; then, the processor continues to the next instruction without waiting for the prior instructions’ operations to complete. As soon as any operations complete, they can use the instructions they were provided at start to callback to the processor with the result. This model allows the processor to avoid getting blocked by costly operations.
Imperative Programming and Functional Programming
These are both styles or paradigms of programming, with different commitments to how computation should be represented in code. The imperative-style of programming is committed to representing computation in the form of short, language-level, imperative statements. The functional-style of programming is committed to representing computation in the form of function abstractions over language-level features.
Now go use callbacks to implement your functional-style asynchronous apps! If you finish anything cool, call me back. I have other things to process,