Slide 2: Last lecture, we introduced the Post Correspondence Problem. Assuming this problem to be undecidable, we showed that ambiguity of context-free grammars is also undecidable. Now we will show the Post Correspondence Problem itself to be undecidable.

Slide 3: Recall that in the Post Correspondence Problem, we are given a finite set of tiles, each tile containing a top string and a bottom string, and we have to decide whether we can use the given set of tiles to form a sequence so that the concatenation of the top strings is the same as the concatenation of the bottom strings. We also allow each tile to be used more than once. To show this problem to be undecidable, we will reduce from the Turing machine acceptance problem. We will only focus on the high-level ideas of the reduction, leaving out some minor details.

Slide 4: In the reduction, we reduce from the question "Does a given Turing machine M accept an input string w?" to the question "Does a collection of tiles T contain a match?". We need to make sure that if the answer to the former is yes, then the answer to the latter is also yes, and vice versa. The reduction again uses the computation history method. We will construct the tiles so that any top and bottom match corresponds to an accepting computation history of Turing machine M on input w. So the matching string looks like the one on the slide, and the tiles look like those at the bottom of the slide.

Slide 5: First, let's assume for the moment that we have the ability to force one of the tiles to be the first tile in a match, and we call this tile the starting tile. The starting tile has the empty string as its top string. Its bottom string is the pound sign # followed by the initial configuration of the Turing machine M on input string w. In the example at the top of the slide, the input string w is "ab%ab". We will then construct a few additional tiles for the Post Correspondence Problem, as described on the next slide.

Slide 6: Here are the four types of tiles. First, there is the starting tile, encoding the initial configuration of the Turing machine M on input string w. Second, there are tiles for valid transition windows between configurations of the Turing machine M. Third, for technical reasons, there are tiles that add blank symbols before the pound symbol #, because sometimes it may be necessary for the read-write head of the Turing machine to move past all non-blank symbols on the tape during the computation. These three types of tiles make up all but the last part of the top and bottom match. When using these three types of tiles to construct a tile sequence, the bottom string is always ahead of the top string by one configuration. For the starting tile, the bottom string contains the initial configuration, while the top string is empty. The bottom string then keeps leading the top string by one configuration. But if we want a top and bottom match, the top string has to somehow catch up with the bottom string. We need tiles whose bottom string is shorter than their top string. These are exactly the fourth type of tiles. They make up the final part of the top and bottom match.
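As an aside, the matching condition from Slide 3 is easy to state in code. Here is a minimal Java sketch, illustrative only; the Tile record and the method name are my own, not from the slides:

    // A tile has a top string and a bottom string.
    record Tile(String top, String bottom) {}

    // Check whether a given sequence of tiles is a match: the concatenation of
    // the top strings must equal the concatenation of the bottom strings.
    static boolean isMatch(Tile[] sequence) {
        StringBuilder top = new StringBuilder();
        StringBuilder bottom = new StringBuilder();
        for (Tile t : sequence) {
            top.append(t.top());
            bottom.append(t.bottom());
        }
        return top.toString().equals(bottom.toString());
    }

Of course, the Post Correspondence Problem asks whether any such sequence exists, and since tiles may repeat, there are infinitely many candidate sequences to consider; that is why decidability is at stake in the first place.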
Slide 7: During the final part of the match, the accepting configuration appears as multiple copies, and each copy has one fewer symbol than the previous one. In the example on the slide, the accepting configuration is "xx%x q_a x"; the next copy has the final "x" removed, then the next copy has the "x" before q_a removed, and so on. Symbols keep being removed from the accepting configuration until only the accepting state q_a is left, and then even this is removed. The string at the center of the slide corresponds to the final part of the match. The fourth type of tiles allows the accepting state q_a in the top string to "eat up" an adjacent symbol. That is, the bottom string of such a tile does not have that symbol adjacent to q_a. We also have one special tile, intended to be the last tile in the match, for the final occurrence of q_a in the top string. This tile has q_a followed by two pound symbols as its top string, and a single pound symbol as its bottom string.

Slide 8: The fourth type of tiles ensures that any top and bottom match must correspond to an accepting computation history. If Turing machine M rejects w, then we get a rejecting computation history as a partial match. The rejecting state q_{rej} appears on the bottom at some point, but the partial match cannot be completed, because the bottom string is always longer than the top string when using only the first three types of tiles. If the Turing machine M loops forever on w, then we cannot form a match using a finite sequence of tiles.

Slide 9: We assumed earlier that we can force one of the tiles to be our starting tile; that is, a match is forced to begin with that tile. We will simulate this assumption by changing our tiles a bit. Suppose these are the four tiles in our "Post Correspondence Problem with Starting Tile", and the first tile is the starting tile. To simulate the effect of a starting tile, our actual Post Correspondence Problem instance will consist of the tiles below. We will add a special symbol * to the strings. There is one tile, corresponding to the starting tile, where we add * before every symbol in the top string and before every symbol in the bottom string. The bottom string also ends with an extra *. Next, for every tile in our "PCP with Starting Tile", we introduce a tile in our actual PCP, where we add * before every symbol in the top string, and we add * after every symbol in the bottom string. These are the "middle tiles". Finally, we add one tile to our actual Post Correspondence Problem, whose top string is * followed by another special symbol, blank, and whose bottom string is just this special symbol blank. This is the "ending tile".

Slide 10: I argue that the "Post Correspondence Problem with Starting Tile" has a match if and only if the actual Post Correspondence Problem has a match. In the actual PCP, the first tile in a match can only be the "starting tile" we have constructed, because it is the only tile where the top string and the bottom string begin with the same symbol, namely *. These extra *s do not affect the possible matches, because they merely interleave the original symbols; the original symbols appear in the even positions of the match. The "ending tile" is the only possible final tile in a match, since it is the only tile whose top and bottom strings end with the same symbol, namely blank. This concludes the high-level ideas for the reduction from the Turing machine acceptance problem to the Post Correspondence Problem.
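The transformation on Slide 9 is mechanical enough to write down. Here is an illustrative Java sketch, reusing the Tile record from the earlier sketch; the helper names and the use of '$' for the special blank symbol are my own assumptions:

    // starBefore("ab") = "*a*b"; starAfter("ab") = "a*b*".
    static String starBefore(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) out.append('*').append(c);
        return out.toString();
    }

    static String starAfter(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) out.append(c).append('*');
        return out.toString();
    }

    // Convert a "PCP with Starting Tile" instance (tiles.get(0) is the starting
    // tile) into an ordinary PCP instance, following Slide 9. (import java.util.*;)
    static List<Tile> simulateStartingTile(List<Tile> tiles) {
        List<Tile> result = new ArrayList<>();
        Tile first = tiles.get(0);
        // Starting tile: * before every symbol, plus an extra trailing * on the bottom.
        result.add(new Tile(starBefore(first.top()), starBefore(first.bottom()) + "*"));
        // Middle tiles: * before every top symbol, * after every bottom symbol.
        for (Tile t : tiles)
            result.add(new Tile(starBefore(t.top()), starAfter(t.bottom())));
        // Ending tile; '$' stands in for the special symbol called "blank" in the lecture.
        result.add(new Tile("*$", "$"));
        return result;
    }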
Slide 11: We now move on to the fourth quarter of this course: polynomial time algorithms.

Slide 12: In the third quarter of this course, we looked at those problems that can be solved by a computer, assuming the computer can take as much time as needed. In reality, when we want to solve a problem using a computer, we don't just want the problem to be solved eventually; it should be solved quickly. When you do a Google search, you don't want to see the results in 1000 years. You want them within a second.

Slide 13: We have classified the problems that can be solved by a computer as decidable problems. Other problems are undecidable. For example, the Post Correspondence Problem and the Turing machine acceptance problem are undecidable.

Slide 14: Within the decidable problems, we can further classify problems as quickly solvable. To measure the time it takes to solve a problem, we look at how the time the algorithm takes grows with the input size.

Slide 15: The running time of a Turing machine M (or an algorithm) is the function T_M(n), defined as the maximum number of steps that M takes on any input of length n. For example, recall the problem of checking whether the two parts of an input string, separated by the pound sign #, are the same. One algorithm to solve this problem is shown on the slide. This algorithm keeps matching and crossing out a symbol before the pound sign with a symbol after the pound sign, making sure they are the same. What is the running time of this algorithm? The second to fourth lines form a loop. Each iteration of the loop may require going from the start of the tape to look for an uncrossed symbol before the pound sign, and then going past the pound sign to match it with an uncrossed symbol after the pound sign. So each iteration of the loop body may take order n steps, where n is the length of the input string. The loop can be executed order n times, because there are order n pairs of symbols to cross out. So the overall running time of the loop is order n squared. This dominates the overall running time, and the running time of the algorithm is order n squared.

Slide 16: Sometimes the same problem can be solved by different algorithms with different running times. Consider the problem of checking whether the input string consists of a number of zeros followed by the same number of ones. One algorithm uses a loop where, in each iteration, we cross out one zero and one one. Each iteration of the loop takes order n time, with a pass from start to end. The loop goes on for order n iterations. So the overall running time is order n squared.

Slide 17: Here is a faster algorithm. We modify the loop as follows. In each iteration of the loop, we find the parity of the number of zeros, and we also find the parity of the number of ones. If the parities do not match, then we reject. Otherwise, we cross out every other zero, and we also cross out every other one. We repeat this loop until either the algorithm rejects or all the symbols have been crossed out. What is the running time of this algorithm? Each iteration of the loop can be carried out in order n time, because finding the parity of the zeros, finding the parity of the ones, and crossing out zeros and ones can all be done in a pass from start to end. The loop runs at most roughly log n times, because the number of uncrossed symbols is halved in each iteration. So the overall running time is order n log n.
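Here is an illustrative Java translation of the Slide 17 algorithm, simulating the tape as a character array; the method name and the use of 'x' for a crossed-out symbol are my own, not the slide's notation:

    // Decides { 0^k 1^k } with the parity-halving idea from Slide 17.
    static boolean equalZerosOnes(String input) {
        char[] tape = input.toCharArray();
        int i = 0;                                      // one pass: check the shape 0...01...1
        while (i < tape.length && tape[i] == '0') i++;
        while (i < tape.length && tape[i] == '1') i++;
        if (i != tape.length) return false;
        while (true) {
            int zeros = 0, ones = 0;
            for (char c : tape) {                       // one pass: count uncrossed symbols
                if (c == '0') zeros++;
                if (c == '1') ones++;
            }
            if (zeros == 0 && ones == 0) return true;   // everything crossed out: accept
            if (zeros % 2 != ones % 2) return false;    // parities disagree: reject
            boolean crossZero = true, crossOne = true;  // one pass: cross out every other
            for (int j = 0; j < tape.length; j++) {     // zero and every other one
                if (tape[j] == '0') { if (crossZero) tape[j] = 'x'; crossZero = !crossZero; }
                else if (tape[j] == '1') { if (crossOne) tape[j] = 'x'; crossOne = !crossOne; }
            }
        }
    }

Each round takes order n time and halves the number of remaining symbols, so there are at most roughly log n rounds, matching the order n log n bound.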
Slide 18: But if we try to solve the same problem on a different machine, we can get an even faster algorithm. In the past few slides we considered solving the problem on a single-tape Turing machine. What about solving the same problem on a two-tape Turing machine? With the extra storage on the two-tape machine, we can copy all the zeros to the second tape. Then we can move the read-write heads of both tapes together, matching the ones on the first tape against the copied zeros, to make sure there are as many zeros as ones. So we can solve the same problem in order n time, even faster.

Slide 19: Again, we can consider yet another model of computation. This time, let's look at Java. Using a simple for loop over the input array, we can check whether there are as many zeros as ones in order n time. The point is that the running time of the best algorithm for the same problem may depend on the model of computation. For a 1-tape Turing machine, it is order n log n. For a 2-tape Turing machine or Java, it is order n.
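The Slide 19 loop might look like the following sketch (illustrative; it also verifies that all zeros come before all ones, as the problem requires):

    // Linear-time check in Java that the input is 0^k 1^k.
    static boolean equalZerosOnesLinear(char[] input) {
        int zeros = 0, ones = 0;
        for (char c : input) {
            if (c == '0') {
                if (ones > 0) return false;  // a zero after a one: wrong shape
                zeros++;
            } else if (c == '1') {
                ones++;
            } else {
                return false;                // not a string over {0, 1}
            }
        }
        return zeros == ones;
    }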
Slide 20: Note that when we say the running time of an algorithm is some function T, it actually has slightly different meanings for different computational models. That's because one time unit means different things on different models. For Java, one time unit may mean a conditional check, or an arithmetic operation such as an addition or a multiplication. For a random access machine, which you can think of as assembly language on a typical computer, one time unit may allow you to write to a register. For a Turing machine, one time unit allows you to go from one state to another, rewriting the content under the head and shifting the head by one tape cell.

Slide 21: Many lectures ago we talked about the Church-Turing thesis. It says all reasonable computing models can solve the same problems, given sufficient time. Now we are looking at problems that can be solved quickly. Would different computational models have different power in solving problems efficiently?

Slide 22: Many decades after Church and Turing proposed their thesis, Cobham and Edmonds proposed an extension of the Church-Turing thesis, saying that any two realistic models of computation can simulate each other with at most a polynomial slowdown. For example, if an algorithm can run in time t(n) on some model, say a 2-tape Turing machine, then it can be simulated in time order t^2 on another model, say a single-tape Turing machine.

Slide 23: For the rest of this course, we only care about whether a problem can be solved in polynomial time. That is, whether it can be solved by an algorithm with running time order n^3, or order n^100, or n to any fixed power. So even though different models can simulate the same algorithm with different running times, with the 1-tape Turing machine usually being very slow and Java being very fast, by the Cobham-Edmonds thesis you can ignore the polynomial overhead in the simulation. Any problem you can solve in polynomial time in one model, say Java, can also be solved in polynomial time in another realistic model, say a Turing machine.

Slide 24: Recall that a few lectures ago, we discussed how to simulate a multitape Turing machine using a single-tape Turing machine. That simulation supports the Cobham-Edmonds thesis. Consider simulating a 2-tape Turing machine M using a single-tape machine S. If an algorithm runs in time t(n) on the 2-tape machine, how long does it take to simulate the algorithm on the single-tape machine?

Slide 25: Each move of the multitape Turing machine M may require traversing the whole tape of S. If s is the position of the rightmost cell ever visited by the single-tape machine S, then simulating one step of M takes O(s) time. After simulating t steps of M, s is at most n + 2t + constant, where n is the input length, 2t is an upper bound on the combined length of the contents of the two tapes of M, and the constant accounts for the extra symbols we need to separate the two tapes' contents when storing them on the single tape of S. Altogether, the rightmost cell ever visited is at most some constant times (n + t). So simulating all t steps of M on S takes time order t times s, which is O(t(n+t)). If the running time t(n) is at least n, then the simulation takes time O(t^2). So if the original algorithm runs on the 2-tape machine in time n^4, say, then it can be simulated on the single-tape machine in time order n^8, which is still polynomial. There is at most a quadratic slowdown in the simulation.

Slide 26: When you try to simulate any realistic computational model with another one, you always see at most a polynomial slowdown. That's the Cobham-Edmonds thesis. I should mention that the Cobham-Edmonds thesis was proposed before quantum computers were considered. Quantum computers are now believed to be a model of computation that can solve certain problems much faster than classical computers, namely factoring and breaking RSA. But quantum computers are still not really useful today, and that's another story.

Slide 27: We now make an important definition. We define P as the class of languages (or problems) that can be decided on a (single-tape) Turing machine with polynomial running time; that is, any problem that can be solved in time n^3, or n^100, or n to any fixed power. Problems in P are considered efficiently solvable. It may sound crazy that we consider algorithms with running time order n^100 as efficient. These algorithms can take trillions of years to run even on inputs of moderate size, say n = 5! However, as we will see in the next lecture, there are many problems that we can only solve in exponential time, say in time 2^n. By comparison, a polynomial running time such as n^100 is better than time 2^n as n grows. By the Cobham-Edmonds thesis, P also represents the class of problems that can be decided in polynomial time by any realistic model of computation, such as Java, random access machines, multitape Turing machines, etc.

Slide 28: Here are some examples of problems in P. Consider the language L_{01} on the slide. It corresponds to the problem of checking whether the input string consists of a sequence of zeros, followed by a one. This problem can easily be solved in polynomial time, in fact in linear time. As we will see, given a fixed context-free grammar G, checking whether an input string w can be generated by the grammar can be done in polynomial time. So the language L_G here is also in P. This implies that the class of context-free languages is a subclass of P. So we have the containments shown on the slide: P contains all context-free languages, while P itself is a subclass of the decidable languages.

Slide 29: Why is every context-free language also in P? Because we have the CYK algorithm. This dynamic programming algorithm runs in time order n^3: there are order n^2 cells in the table, and every time we fill in a new entry of the table, we need to look at order n existing entries.

Slide 30: Let's consider another problem: given an input graph G and two special nodes s and t, check whether there is a path from s to t. This problem can be solved in linear time on a real computer (or a random access machine), using the DFS or BFS algorithm you learned in a data structures course. Without random access, you can also solve it in time order m times n on a single-tape Turing machine, where n is the number of nodes and m is the number of edges. In any case, the running time is polynomial in the input length. So the language PATH defined on this slide is in P.
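To make the Slide 29 count concrete, here is a compact, illustrative CYK sketch in Java for one fixed grammar in Chomsky normal form generating { 0^n 1^n : n >= 1 }; the encoding of variables and rules is my own assumption, not the slide's notation:

    // Variables numbered 0..3:
    //   0: S -> Z A | Z O     1: A -> S O     2: Z -> '0'     3: O -> '1'
    static final int[][] BINARY = { {0, 2, 1}, {0, 2, 3}, {1, 0, 3} }; // rules A -> B C as {A, B, C}

    static boolean cyk(String w) {
        int n = w.length();
        if (n == 0) return false;
        // table[i][len-1][v] = true if variable v generates the substring of w
        // of length len starting at position i
        boolean[][][] table = new boolean[n][n][4];
        for (int i = 0; i < n; i++) {                      // substrings of length 1: unary rules
            if (w.charAt(i) == '0') table[i][0][2] = true; // Z -> 0
            if (w.charAt(i) == '1') table[i][0][3] = true; // O -> 1
        }
        for (int len = 2; len <= n; len++)                 // order n lengths...
            for (int i = 0; i + len <= n; i++)             // ...times order n starts = order n^2 cells
                for (int split = 1; split < len; split++)  // order n split points per cell
                    for (int[] r : BINARY)                 // constantly many rules
                        if (table[i][split - 1][r[1]] && table[i + split][len - split - 1][r[2]])
                            table[i][len - 1][r[0]] = true;
        return table[0][n - 1][0];  // does the start variable S generate all of w?
    }

The three nested loops mirror the count on Slide 29: order n^2 table cells, and order n split points examined per cell.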
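Similarly, for PATH on Slide 30, breadth-first search runs in linear time. A minimal Java sketch, assuming the graph is given as adjacency lists:

    // Is there a path from s to t? Breadth-first search over adjacency lists.
    // (import java.util.*; assumed; adj.get(u) lists the neighbors of node u)
    static boolean path(List<List<Integer>> adj, int s, int t) {
        boolean[] visited = new boolean[adj.size()];
        Deque<Integer> queue = new ArrayDeque<>();
        visited[s] = true;
        queue.add(s);
        while (!queue.isEmpty()) {
            int u = queue.remove();
            if (u == t) return true;
            for (int v : adj.get(u))
                if (!visited[v]) {       // each node enters the queue at most once,
                    visited[v] = true;   // so the total work is order n + m
                    queue.add(v);
                }
        }
        return false;
    }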
Slide 31: Are there problems that are not polynomial time solvable? We suspect the following problem is one of them. This problem looks very similar to PATH on the previous slide. It is called the Hamiltonian path problem. In this problem, we are given an input graph G and two special nodes s and t, and we want to check whether there is a path from s to t that visits every node exactly once. For instance, in the graph shown on this slide, there is a Hamiltonian path from s to t, because that path visits every node in the graph exactly once. We don't know whether this problem is in P, and we believe it is not. We will mention evidence in the next lecture, when we introduce NP and NP-completeness.
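To see the contrast with the BFS sketch above, consider the obvious algorithm for Hamiltonian path: try to extend a simple path from s in all possible ways. Here is an illustrative backtracking sketch; in the worst case it takes exponential time, and no polynomial-time algorithm is known:

    // Backtracking search for a Hamiltonian path from s to t.
    // (import java.util.*; assumed; adj.get(u) lists the neighbors of node u)
    static boolean hamiltonianPath(List<List<Integer>> adj, int s, int t) {
        boolean[] used = new boolean[adj.size()];
        used[s] = true;
        return extend(adj, s, t, used, 1);
    }

    static boolean extend(List<List<Integer>> adj, int u, int t, boolean[] used, int visited) {
        if (visited == adj.size()) return u == t;  // all nodes on the path: it must end at t
        for (int v : adj.get(u))
            if (!used[v]) {
                used[v] = true;
                if (extend(adj, v, t, used, visited + 1)) return true;
                used[v] = false;                   // backtrack and try the next neighbor
            }
        return false;
    }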