Assume all memory accesses are cache hit. 20% of the Load instructions are followed by a dependent computational instruction, and 30% of the computational instructions are also followed by a dependent computational instruction. 20% of the branch instructions are unconditional, while 80% are conditional. 40% of the conditional branches are taken, 60% are not taken. The penalty for taking the branch is one cycle. If data forwarding is allowed, what is the instruction throughput for pipelined execution?