Theory of Computation

CS 139-0
Drake University
Spring, 2006






Basic information

Instructor: M. Q. Rieck
Office: Howard 219
Phone: 271-3795
E-Mail: Michael.Rieck@Drake.edu
Homepage: www.drake.edu/mathcs/rieck (to get to ongoing course info)
Office hours: MW 4-6, TR 11-12, and by appointment

Text: M. Sipser, Intro. to the Theory of Computation, 2nd ed., PWS Publishing, 2005



Course objectives

From the catalogue: Theoretical foundations of computing. Introduction to formal grammars, languages and automata theory. Mathematical analysis of the fundamental power and limitations of computing devices. Applications to pattern matching, problem specification, programming languages and compilers.

As an example of a practical application, we will investigate the eXtensible Markup Language (XML), which is becoming increasingly important in web development. An XML programming project will be assigned. We will also study complexity issues, and in particular, the P and NP classes.



Policy

Please check messages at the course web site regularly.

All assigned work should be submitted on time. Late assignments will not be accepted.

Anticipated absences from exams must be approved by the instructor well in advance. Emergency absences should be supported by documentation with the name and phone number of a professional involved in the emergency.





Grading

    Final Exam (Wed., May 10, noon)         24%    
    1st Segment Test (Mon., Feb 13)         12%    
    2nd Segment Test (Mon,, Mar 13)         12%    
    3nd Segment Test (Mon., Apr. 17)         12%    
    Three Projects combined         30%    
    Homework and participation         10%    




Homework Assignments

    Chapter 0    
    Exercises 0.1 to 0.11    
plus
  • Prove by induction that given any natural number m, there exists a connected graph with m vertices and m-1 edges. Be rigorous.
  • Prove that given any connected graph G, there is some vertex of G whose removal from G results in a connected graph. Be rigorous.
    Due: 1/27    
    Chapter 1 (Batch 1)    
    Exercises 1.1 to 1.10    
    Due: 2/10    
    Chapter 1 (Batch 2)    
    Use the Pumping Lemma to CAREFULLY PROVE that each of the following languages is NOT regular:    
  • The set of strings of the form 0m1n0m where m and n are non-negative integers.
  • The set of strings of the form wzw , where w and z are strings over the alphabet { a , b } .
  • The set of strings consisting of 2n copies of the letter a , where n is any non-negative integer.
    (I think that's what I wrote on the board. Let me know if not. Also, "prove" means you need to communicate a logical argument in words, not just scratch down a bunch of symbols.)
    Due: 3/3    
    Chapter 2    
    Exercises 2.3, 2.4, 2.6, 2.8, 2.9 (same in both editions)    
    Due: 3/8    



Programming Projects

  1. Project One     (due 3/3)
  2. Project Two     (due 4/3)




Examples and Old Exams

  1. First Exam from Spring 2004
  2. Second Exam from Spring 2004
  3. Examples of deterministic automata with specified languages (page 1)
  4. Examples of deterministic automata with specified languages (page 2)
  5. Examples of nondeterministic automata with specified languages (page 1)
  6. Examples of nondeterministic automata with specified languages (page 2)
  7. Subset construction example



Homework and Exam Solutions




Links to related web material




Newsworthy messages (check regularly)

  1. Chapter Zero homework is now posted.

  2. On the last Monday of each month, I will not begin my office hours until 4:30, due to a department meeting.

  3. When it comes to "proofs," the devil is certainly in the detail. One tiny mistake can mean that you have an incorrect proof, and in fact the theorem might be wrong! Here is a detail I skipped in my first proof. Remember that G is chosen to be a counter-example with the fewest possible vertices. I pointed out that G cannot be the empty graph (remember why?). But I forgot to also point out that G cannot be a graph with just one vertex (and no edges). Since G is connected, it is now clear (and correct) to say that the degree of each vertex is at least one.

    Concerning my second "bullet problem" from the homework, let me give you a hint as to how I would do it. Assume false. Let G be a counter-example with the fewest possible vertices. So there is some vertex whose removal from G disconnects the graph. Let v be such a vertex, but choose one whose removal from G produces a connected component with the fewest possible vertices. Then focus your attention on that connected component.

  4. It was correctly pointed out after class that it was unnecassary to get involved with the "contrapositive" during today's proof. That said, I wish to address three issues here:

    • Make SURE you appreciate the equivalence of the following

      • p => q
      • ~p v q
      • q v ~p
      • ~(~q) v ~p
      • ~q => ~p

      Consider for example, the conditional statement "if you are taking CS 139, then you are learning about automata." This means the same thing as "either you are not take CS 139 or you are learning about automata (or both)." That doesn't necessarily feel comfortable at first, but you must get used to it! The conditional statement will be true if you are not taking CS 139, regardless of whether or not you are learning about automata. (This is perhaps because it is generally agreed that propositions are true unless they have to be false. If you aren't taking CS 139, anything I assert predicated on the condition that you are is true!) On the other hand, the conditional statement will be true provided that you are learning about automata, regardless of whether or not you are taking CS 139. That's it. But let's consider the contrapositive: "if you are not learn about automata, then you are not taking CS 139." This is logically equivalent to the original conditional statement.

    • I want to show you the most direct proof by induction that Letzter and I could come up for the "graph theorem." Here goes:

      Theorem: For any natural number n, any connected graph with precisely n vertices has at least n-1 edges.

      Proof. Clearly OK when n = 1. Now suppose true for some particular natural number k (i.e. n=k). We need to show that from this assumption it would necessarily also be true when n = k+1. To do so we need to consider an arbitrary connected graph G with k+1 vertices. From a HW assignment we know that G has at least one vertex v whose removal yields a connected graph H (with k vertices). Note that k is at least one, so H is not the empty graph. From our assumption (the "inductive hypothesis"), H must have at least k-1 edges. Going back to G, we know that G has at least one edge incident with v. This edge is not part of H. So G has at least k edges (=1+(k-1)). Good. We have thus established that if we're OK for n=k, then we are also OK for n=k+1. (Using formal notation used in class, P(k) => P(k+1).) By the Principal of Mathematical Indication, the theorem now follows (for all natural numbers n, P(n)). []

    • Here is my solution to a HW exercise. You were essentially asked to prove that in any connected graph there is always at least one vertex whose removal keeps the resulting graph connected. Here is my proof:

      Suppose this is false. Let G be a minimal counter-example. That is, let G be a connected graph with as few vertices as possible such that the removal of any of these vertices disconnects the graph. Now, let v be a vertex of G whose removal produces a connected component with as few vertices as possible (as per my hint). Let H be this connected component. Since H has fewer vertices than G, it must contain a vertex u whose removal from H results in a connected graph K. Now consider K as part of G. Is it possible that no vertex in K is adjacent to v? If this were so, then K would be directly connected (by at least one edge) to u, u would be adjacent to v, and v would be connected to the rest of G, but there would be NO other edges. Picture:
              K ====== u ------- v ======= (rest of G) 
      
      Here "=======" means one or more edges while "-------" represents a single edge. But this situation contradicts our choice of v, since removing u from G would produce a smaller connected component, namely K. Therefore, there must be some edges connecting K directly to v. It is then clear that by removing u, we would still have a connected graph. Picture:
       
              K ====== v ======= (rest of G) 
      
      This is a contradiction. So the original claim (theorem) has been established.

  5. I have been persuaded to hand out my handwritten lecture notes as we go along. I'll give you the first 40 pages next time!

  6. I'm hoping to get your HW tomorrow (Wed.) so I can turn it back to by by Friday, and so before the test. However, I will accept it as on-time if you turn it in this Friday.

  7. The old exam posted above might be of some use to you. Unfortunately there are some missing figures that got lost.

  8. Hopefully you are now convinced, not because I say so, but because you can "see it," that the union of two regular languages is regular, and that the composition of two regular languages is regular. We stopped before getting to page 51 of my notes. I made a "standard mistake" in my proof of the second theorem on that page. The first one is OK. For the second one, let's start the proof again. Let N be an NFA with language L. We can add a new start state to L and hook up an epsilon-transition from it to the old start state (which is no longer the start state). However, let's also make the new start state accepting. Call the resulting NFA N'. The language of N' is easily seen to be {epsilon} union L, which is the same thing as L0 union L1. This language contains epsilon regardless of whether or not L does, that is, regardless of whether or not the starting state in N (original NFA) is accepting. Next, hook up epsilon transitions from each accepting state to the new start state. By the previous result (on page 51), the language of the resulting NFA is ({epsilon} union L)+. But this is clearly L*.

  9. Here is a little more explanation of my "subset construction" example for turning an NFA into a DFA with the same language. Consider the state labeled {a,c} in the DFA. This DFA state corresponds to the idea "maybe the NFA is in state a or maybe it's in state c." What would happen on an input of the symbol 1 ? Well, let's see, in the NFA, from a, you can get to b, but also, from c, you can get to both c and d. Therefore, in the DFA, you put in a transition, on the input 1, from the state {a,c} to the state {b,c,d}. And so forth.

  10. I forgot something important on page 43 of my lecture notes. It would be an item between items 1 and 2. Let me try to capture the whole deal here. I will used the following notation since I cannot manage math symbols here (at least not without more effort that I plan to put into this):

    • I will write "d" in place of "delta",
    • and I will write "dh" in place of "delta hat,"
    • and I will write "e" for "epsilon,"
    • and I will write "s" for "sigma,",
    • and I will write "t" for "tau."

    Then the following things should be true in general for states q, q', q'', and for input strings s and t:

    • q is in dh(q,e);
    • if q' is in d(q,e) then q' is in dh(q,e);
    • if q' is in dh(q,e) and q'' is in dh(q',e) then q'' is in dh(q,e);
    • dh(q,e) is as small as possible subject to the above three requirements;
    • if s is a symbol and q' is in d(q,s), then q' is in dh(q,s);
    • if q' is in dh(q,s) and q'' is in dh(q',t) then q'' is in dh(q,st);
    • dh(q,s) is as small as possible subject to the above requirements.

  11. Qasim and I worked out more examples that we would like to share with you:

    • Consider an NFA with states A, B and C. Let's say that on input 0, the following transitions are possible: A->A , A->B , C->B and C->C . Let's say that on input 1, the following transitions are possible: A->C, B->A, B->C . Let's say that A is starting and accepting and that C is also accepting. (Stop and draw the state diagram for this.)

      The subset construction produces a DFA with the same language, but 8 states labeled using the 8 subsets of {A,B,C}. Here are the transitions on a 0 input:
          {}      ->   {} 
          {A}     ->   {A,B} 
          {B}     ->   {} 
          {C}     ->   {B,C} 
          {A,B}   ->   {A,B} 
          {A,C}   ->   {A,B,C} 
          {B,C}   ->   {B,C} 
          {A,B,C} ->   {A,B,C} 
      

      Here are the transitions on a 1 input:
       
          {}      ->   {} 
          {A}     ->   {C} 
          {B}     ->   {A,C} 
          {C}     ->   {} 
          {A,B}   ->   {A,C} 
          {A,C}   ->   {C} 
          {B,C}   ->   {A,C} 
          {A,B,C} ->   {A,C} 
      

      The starting state is of course {A}. The only non-accepting states are {} and {B}, because the others have either A or C as part of their label. Notice that being in the state {A,B} in the DFA is like saying "I could be in state A or B in the NFA." From here an input or 1 takes you to {A,C} because in the NFA you can go from A to C or from B to A or from B to C.

    • We also considered the language that consists of all binary strings except 11. But this is the complement of the language { 11 } , complementing in the language of all binary strings, that is, {0,1}*. It is possible to make a 4-state DFA whose language is { 11 } . Then just reverse which states accept and which do not, and you have a DFA whose language is the desired language. We need to use a DFA, not an NFA here, right?

    • Let's try another example. Say L1 is the language of binary strings that begin and end with a 1. Say L2 is the language of binary strings that have an odd number of 1's. Here are two DFAs for these languages, as you will no doubt check for yourself:


      Let's now draw the "Cartesian product" of these two automata, but ignore accepting states for now. We get the following, as you should be able to check:


      Notice that I did not indicate accepting states here. First get comfortable with the idea that each state here is keeping track of a pair of states, one from the first automaton, and one from the second. The states in the top (bottom) row correspond to an even (odd) number of ones. Notice that the state in the bottom-left position is unreachable from the start state. It can be removed, although it is part of the Cartesian product construction.

      Now figure out which state(s) should accept if you want the language to be the intersection of the original two languages. Then decide again if you want the language to be the union instead of the intersection.

  12. Project One is now posted. I added a link to the course website from two years ago. There are several useful messages there related to Project One. When you're ready to start work on this project, go check out the messages from the one that begins "Technical Issue:" up through the one that begins "More on Project One:".

  13. On Exam Two, you will need to use something called the "Pumping Lemma" to PROVE that a certain language is NOT regular. There will be NO PROOFS on Exam Three.

  14. Notice that I posted the website for the course two years ago. As you can see from a quick scan of it, the course content builds in complexity. Even though there won't be much emphasis on proofs on exams, you seriously need to consider the nature of this course and the need to keep up with the material. If you feel that you are in over your head, then let's discuss either doing extra work outside of class or possibly withdrawing from the course, if you feel it is too late to catch up.

  15. I will try to remember to announce this in class. Please remind me. There will be an interesting mathematics talk concerning "capillary surfaces" on Wednesday. Here are the details:
     
    Title: Capillary surfaces and their behaviors in cylinders
    
    Speaker: Dr. Kirk Lancaster, Professor of Mathematics, Wichita State University
    
    Date and time: Wednesday, February 22, 3:30 pm Location: Howard Hall 308
    
    Following the talk, Prof. Lancaster will be available to talk with students who are thinking about graduate school.
    
    Abstract: Capillary surfaces are ubiquitous: a drop on your windshield, the sap in a tree, the fuel in a spaceship, fluid in a cylinder, etc. 
    Capillary phenomena occur as the result of an interaction of surface tension, exterior force fields (e.g. gravity) and the attraction of fluids
    to surfaces (i.e. "wetting energies"). In a microgravity environment such as in free fall or at very small scales such as occurs in nanoscale
    fabrication, the influence of gravity becomes largely insignificant and the interaction of surface tension and surface chemistry becomes
    dominant in determining the shapes of stationary liquids.
    
    Special types of capillary surfaces can be of great interest; some examples are hanging drops ("pendant drops"), drops on a surface ("sessile
    drops"), liquid bridges, fluids in a vertical cylinder and fluids in a container.  When the container is not smooth (e.g. has edges or
    corners), determining the behavior or shape of a capillary surface in the container can be problematic.  I will consider capillary surfaces in
    a vertical cylinder whose cross-section has a corner P. In this case, the capillary surface will be a graph z=f(x,y) over the cross-section and
    the function f may be discontinuous at P; this problem has been investigated by a variety of researchers.  When the cross-section has a convex
    (or protruding) corner at P, Paul Concus and Robert Finn made a conjecture approximately 14 years ago that under certain conditions, f must be
    discontinuous at P. I will describe the behavior of capillary surfaces z=f(x,y) near P when f is discontinuous at P and give a very brief
    outline of my recent proof of the Concus-Finn Conjecture.  I will also discuss examples. 
    
    

  16. To clarify something: Projects are to be individual efforts, though you are allowed to get help from me.

  17. I just posted the homework on the pumping lemma, but want to stress again (if you want credit on the exam) that you must PROVE that such-and-such-a language does not satisfy the P.L.P. You need to write complete and correct sentences for this, not just scratch down a bunch of symbols. Read my examples CAREFULLY.

  18. I posted Chapter Two homework on context-free grammers.

  19. A couple things were decided after class today:

    • For simplicity, you may assume that the input file has exactly one number per line.
    • You should write the completelyReduce function and you should arrange for it to eliminate the original states (but not the new start nor the new accept state) in increasing numeric order.
    • You guys should let me (the user) specify the name of the file containing the data.

  20. Here is what happens to the project example, as states are eliminated in the order 1,2,3,4.

    • Here is how things should be to begin with (see bottom of this message):
       
                RegExpr[0][0] = "e"
                RegExpr[0][1] = "e"
                RegExpr[0][2] = "p"
                RegExpr[0][3] = "p"
                RegExpr[0][4] = "p"
                RegExpr[0][5] = "p"
                RegExpr[1][0] = "p"
                RegExpr[1][1] = "e"
                RegExpr[1][2] = "2"
                RegExpr[1][3] = "2"
                RegExpr[1][4] = "p"
                RegExpr[1][5] = "e"
                RegExpr[2][0] = "p"
                RegExpr[2][1] = "p"
                RegExpr[2][2] = "e"
                RegExpr[2][3] = "p"
                RegExpr[2][4] = "p"
                RegExpr[2][5] = "e"
                RegExpr[3][0] = "p"
                RegExpr[3][1] = "p"
                RegExpr[3][2] = "p"
                RegExpr[3][3] = "1"
                RegExpr[3][4] = "1|2|e"
                RegExpr[3][5] = "p"
                RegExpr[4][0] = "p"
                RegExpr[4][1] = "2"
                RegExpr[4][2] = "p"
                RegExpr[4][3] = "e"
                RegExpr[4][4] = "e"
                RegExpr[4][5] = "p"
                RegExpr[5][0] = "p"
                RegExpr[5][1] = "p"
                RegExpr[5][2] = "p"
                RegExpr[5][3] = "p"
                RegExpr[5][4] = "p"
                RegExpr[5][5] = "e"
                
    • After eliminating state 1:
                RegExpr[0][0] = e
                RegExpr[0][2] = 2
                RegExpr[0][3] = 2
                RegExpr[0][4] = p 
                RegExpr[0][5] = e 
                RegExpr[2][0] = p 
                RegExpr[2][2] = e
                RegExpr[2][3] = p 
                RegExpr[2][4] = p 
                RegExpr[2][5] = e 
                RegExpr[3][0] = p 
                RegExpr[3][2] = p 
                RegExpr[3][3] = 1 
                RegExpr[3][4] = 1|2|e 
                RegExpr[3][5] = p
                RegExpr[4][0] = p 
                RegExpr[4][2] = 22 
                RegExpr[4][3] = e|22
                RegExpr[4][4] = e
                RegExpr[4][5] = 2
                RegExpr[5][0] = p 
                RegExpr[5][2] = p  
                RegExpr[5][3] = p  
                RegExpr[5][4] = p  
                RegExpr[5][5] = e
           
    • After eliminating state 2:
                RegExpr[0][0] = e
                RegExpr[0][3] = 2
                RegExpr[0][4] = p
                RegExpr[0][5] = e|2
                RegExpr[3][0] = p 
                RegExpr[3][3] = 1
                RegExpr[3][4] = 1|2|e  
                RegExpr[3][5] = p
                RegExpr[4][0] = p 
                RegExpr[4][3] = e|22 
                RegExpr[4][4] = e
                RegExpr[4][5] = 2|22  
                RegExpr[5][0] = p
                RegExpr[5][3] = p  
                RegExpr[5][4] = p  
                RegExpr[5][5] = e
           
    • After eliminating state 3:
                RegExpr[0][0] = e
                RegExpr[0][4] = 21*(1|2|e)
                RegExpr[0][5] = e|2
                RegExpr[4][0] = p  
                RegExpr[4][4] = (e|22)1*(1|2|e)
                RegExpr[4][5] = 2|22  
                RegExpr[5][0] = p
                RegExpr[5][4] = p  
                RegExpr[5][5] = e 
           
    • After eliminating state 4:
                RegExpr[0][0] = e
                RegExpr[0][5] = e|2|21*(1|2|e)((e|22)1*(1|2|e))*(2|22) 
                RegExpr[5][5] = e
           

    A correction: Epsilon is always an implicit transformation from a state to itself. So, in my original example, where I had RegExpr[0][0] = p, it should actually be RegExpr[0][0] = e. And so forth. Why does it matter? Concerning the concatenation operation, we have the following analogy to multiplication of numbers. Phi is to zero, as epsilon is to one. Big difference! Also, phi star equals phi, and epsilon star equals epsilon. Epsilon is the "identity" for the concatenation operation, while phi is the "identity" for the "or" ("union") operation (i.e. vertical bar). This is analogous to the fact that 1 is the identity for multiplication, while 0 is the identity for addition, in ordinary number arithmetic.

  21. Project Two is now posted.

  22. Tried three cases for Project One. Here are the test data:

    •       4 -2 2 8 1 2 2 1 3 2 3 3 1 3 4 1 3 4 2 4 1 2 3 4 0 4 3 0
    •       3 -1 2 6 1 2 1 2 3 1 3 1 1 2 1 2 3 2 2 1 3 2
    •       2 1 1 1 1 2 1

    The corresponding answers should be essentially like so:

    •       e|2|21*(1|2|e)((e|22)1*(1|2|e))*(2|22)
    •       e|1(e|21)*2|(2|1(e|21)*(1|22))(e|12|(2|11)(e|21)*(1|22))*1|(2|11)(e|21)*2
    •       1

    Your answers might not look quite the same, but it should be straightforward to see how to reduce your answers to the above answers.

    I'm going to give you another couple days to patch up your code so that it will read from my data files, and LET ME SPECIFY which file. That's all you are allowed to change. This is really a pain for me to wrestle with the different ways you want to read the data (some using a single line, and some using on integer per line). Argh! Look, here are my data files (3 versions for each of 3 test cases). Make sure I can process at least one version for each test case:

    dataset1v1     dataset1v2     dataset1v3
    dataset2v1     dataset2v2     dataset2v3
    dataset3v1     dataset3v2     dataset3v3

  23. Here are the three extra Pumping Lemma "homework" exercises (that I won't collect, and I fixed the first one):

    • 4. L = { ambncm   |   m, n >= 3 }
    • 5. L = { ww   |   w in {0,1}* }
    • 6. L = { www   |   w in {0,1,2}* }
    • 7. L = { 0w1w   |   w in {0,1}* }
    • 8. L = { wtwREV   |   w, t in {0,1}* }
    • 9. L = { sigma in {a,b,c}*   |   each of the three letters (a,b,c) occurs the same number of times in sigma, and no letter is immediately repeated in sigma }

  24. Exam Two (next Monday) will have five problems of equal weight. Three of these will deal with CFGs (no PDAs). One will deal with state elimination. One will deal with the Pumping Lemma. This will have two parts. The first part will be a straightforward application of the PL. It will be similar in nature to our easier examples and exercises. But you must giva a clear proof that is at least as thorough as the ones I've given you. The second part will be more challenging, but not as bad as some of the hardest examples and exercises.

  25. Here is the kind of complicated thinking that is required on some of these CFG problems. Let's consider Exercise 2.6(b). We're trying to express the language made up of symbols a and b that are not of the form anbn. Teclar and my solution requires five variable symbols: S, A, B, X and Y. Now, I want A to represent a string of at least one a's and no b's. For this I introduce the rules A --> a | aA . I also introduce the rules B --> b | bB so that B represents a string consisting of only b's, and no a's. I want X to represent some number of a's followed by the same number of b's. For this I use X --> epsilon | aXb . I want Y to represent any binary string, so I use Y --> epsilon | aY | bY . Now, I will just give you my transformation rules for S and let you think about why they are appropriate: S --> bY | Ya | YbaY | AX | XB . That's it.

  26. Project Two says to use "cs160" drop box. Of course I meant "cs139".

  27. I posted the second exam from two years ago in the soltion section (above). As with the first exam, there are missing figures and parts of the exam may not be relevant to us this year.