There is also a use of the bare term "enumerable" applied to sets, but this means something different. We'll come back to that later. When I say that a particular automaton or kind of automaton can enumerate a set, that's one thing; when I say that the set is enumerable, that's different. The uses have some relation to each other, but "enumerable" doesn't mean "can be enumerated by some conceivable automaton."
There is also a use of the term "decidable" to talk not about sets but about sentences in a deductive system. That again is somewhat different from the use introduced here. There what's meant is whether the sentence is either provable or refutable in the specified deductive system. This has some connections to the notion of decidability introduced here, but they're not the same thing. A sentence's not being decidable in a specific formal system doesn't in itself settle the question whether there might be other ways to effectively calculate its truth-value. It would probably be best if we used different terms here, like "deductively decidable (in system so-and-so)" vs "effectively decidable."

### Deciding whether a string belongs to some formal language ###
Our present question is, given an alphabet $\Sigma$ and some formal language $L \se \Sigma^*$ --- usually specified by a grammar that generates it --- what will it take to decide membership in $L$. That is, what kinds of automata will be able to decide, for arbitrary strings $S \in \Sigma^*$, whether $S \in L$, yes-or-no? When an automaton can do this, we say that it **decides membership in** the language $L$.
Membership in many formal languages can be decided by automata much weaker than Turing machines.
There turn out to be deep and interesting parallels between the categories of grammar we mentioned before (regular grammars, context-free grammars, and some more complex grammars) and what kinds of automata are capable of recognizing the languages those grammars generate.
Earlier I introduced the notion of a **regular language**, as being a language generatable by some regular grammar.
It's also possible to define these languages in algebraic terms. One definition goes:
* Let $\Sigma$ be a finite alphabet, and $f$ be a monoid (homo)morphism from $\Sigma^*$ to some finite monoid $M$. Let $N$ be any subset of $M$. Then $\set{w\in \Sigma^* \where f(w) \in N}$ is a regular language.
Another definition goes:
* Define an equivalence relation $\sim$ on a language $L\se\Sigma^*$, where
$u\sim v$ iff $L$ is closed wrt prefix replacement of $u$ by $v$; that is, $\forall w\in\Sigma^*(u\concat w\in L\latin{ iff }v\concat w\in L)$. Then $L$ is regular iff the number of equivalence classes $\sim$ partitions it into is finite.
We'll come back to the notion of an equivalance class in a few weeks.

It turns out that regular languages are exactly the languages whose membership is decidable
by a very weak kind of automaton known as a **finite-state**
machine or automaton (sometimes abbreviated FSM or FSA). For those languages,
there is always an finite-state automaton capable of giving a definite and
correct answer to the question whether an arbitrary string belongs to the
language. Of course, I don't mean you could build one and it will never break.
As I've said, these are mathematical models or recipes, not real machines. I
called them "weak" because they are limited in certain ways; but this category
includes automata that are too complex for humans to ever build a real instance
of. Don't worry, though. The ones we will work with will be very simple. You'll
be able to draw them on a single piece of paper.
Given a regular language, there is always a finite-state automaton capable of deciding which strings belong to it and which don't. The converse is also true; given any finite-state automaton, there will be a regular language whose membership it decides. (It may be that it is a language recognized by distinct finite-state automata, as well.) This is what we learn from Kleene's Theorem, discussed in the Epp reading. Kleene's Theorem explicitly concerns the relation between finite-state automata and languages generated by Kleene's regexes, not languages generated by regular grammars. But it is straightforward to translate between these regexes and regular grammars, and the languages they generate are the same. (For more details, see the optional Sipser Chapter 1 reading.)
In general, automata come in two varieties: **deterministic** and **non-deterministic**. Non-determinism here doesn't have to do with chance or randomness. It's probably a misleading label for philosophers. But this terminology is well-entrenched. I will explain what non-determinism is later. For the moment, we will focus on deterministic finite-state automata. These are often abbreviated as **DFA**.
DFAs are the kinds of machines you see drawn in the directed graphs on p. 223 of the Gamut reading, and on pp. 792 and following in the Epp reading. These automata have a number of *internal states*, represented by the vertices in those graphs. There are only a finite number of such states, that's why they are called *finite*-state automata. Graphs like these are also used in the specification of more powerful automata, but those more powerful automata also have memory registers, or a stack where they can write information to retrieve later. Turing machines have an infinite array of memory. Not the poor finite-state automata. They don't have any memory other than what is implicit in what vertex they happen to "occupy" at a given step of their computation. That is what makes them so weak.
So the DFA has a certain number of internal states. One of these is marked as the *starting state*. The DFA also has a
*next-state function*, which specifies what it should do next, depending on what state it happens to be in and what input it then receives.
For DFAs, the menu of what to do next is rather limited. There's no additional memory they can write to or read from. All they can do is move to another internal state. (Or they can stay in the state they're in.) How they will choose is specified by their next-state function, and displayed in the directed graphs you see. Suppose there's an arrow leading from vertex $s_1$ to vertex $s_2$, labeled with some letter. This means, if you're in vertex $s_1$ and you get that letter as input, then go to vertex $s_2$. Then you're at the next step of the computation. At each step, the automaton is fed a new input letter. It can't backtrack and see what it was fed before. It's just force-fed one new letter per step. These are the letters of the string they're trying to recognize, conventionally read from left-to-right. (Given the way I orginally defined strings, though, it'd be more natural to go in the other direction, wouldn't it?)
When a DFA has been fed all of the input, that's it. It doesn't do anything further. It just stays in whatever internal state it last moved to. However, we'll have marked some of the states as being "accepting states" and the rest as being "rejecting states." What that means is that if a string is fed to the automaton and when it stops it's in a "accepting state," then its answer to the question whether that string belongs to the language is "yes." Else the answer is "no." That's about it.
When there are multiple accepting states, we sometimes attach special signficance to which one the automaton stopped in.
HOMEWORK EXERCISES:
(51) Work through Epp's examples 12.2.4, 12.2.5, 12.2.6, 12.2.7. Try to answer the questions she poses. She then gives the solutions and talks through an explanation of them, so I don't want you to reproduce that. You can just tell me "Yes I've worked through these examples and think I understand them."
With the automata you just looked at, for every vertex and every possible next input letter, there was always exactly one arrow leading away from that vertex (perhaps back to the same vertex). We might adopt a convention to sometimes allow there to be zero arrows for some input letters. That's to be understood as though there *were* an arrow with that label, but leading to an undisplayed vertex that was "rejecting" and which led back to itself for every possible further input. So a kind of black hole, from which no further input will summon a verdict of "accept." There's a graph at the bottom of Epp's p. 803 that is missing some arrows, so needs to be interpreted like this.
That graph at the bottom of p. 803 has another curious feature as well. It's *not just* that sometimes it has zero arrows for a given input: leading from state $s_0$ we see that there are also *two* arrows labeled with the input $1$. What could this mean?
This is what makes that automaton *non-deterministic*. In particular, it's a non-deterministic finite-state automaton, often abbreviated as **NFA**.
You're *not* supposed to understand "non-determinism" here as meaning that the automaton chooses one of the arrows randomly. Rather, you should think that all of the outgoing arrows are (potentially) explored. You could think of all the explorations running simultaneously, on parallel computer chips, each consuming the next piece of input at the same time. (And perhaps in the course of doing so, we'd have to split into multiple processes again... and again...) Or you could think that the NFA just explores one outgoing arrow at a time, but keeps a snapshot of where it was before it started, so that if things go bad it can rewind to this point, including rewinding the input, and start again with another arrow. If *any* of the explorations lead to the input being exhausted when the NFA is in an "accepting" state, then the string counts as accepted. (Any pending snapshots can then just be ignored.) If *none* of the explorations end with the NFA stopped in an "accepting" state, then the string counts as rejected.
Sometimes, the arrows in an NFA's decision graph are labeled with an $\emptystring$ instead of a letter from the relevant alphabet. When the automaton enters a state with one or more outgoing arrows labeled with $\emptystring$, then --- before consuming any further input --- it "splits" into multiple exploration paths, one following each of those arrows, and one remaining in the current state. Otherwise this works just as described above.
In the case of *finite-state* automata, there is only a practical difference between the deterministic and non-deterministic varieties. It's possible in principle to translate any NFA into a corresponding DFA (though it may be complicated). In the case of more powerful automata, though, there is sometimes a theoretical difference, and the non-deterministic versions can recognize more languages.
Even in the case of NFAs, it can in practice be difficult to construct a corresponding DFA. But the example on Epp's p. 803 is not of this sort. You can easily produce a DFA that recognizes the same input as that NFA.
HOMEWORK EXERCISE:
(52) There are some issues with the example on the bottom of Epp's p. 803. First of all, the displayed automaton lacks outgoing arrows for some possible inputs. But I explained above how we can interpret that. (Let's suppose the possible input letters are just $0$ or $1$.) Another issue is that in the text accompanying her diagram, she says that the automaton "accepts the set of all strings beginning with a 1." But that is not correct; anyway, not if we interpret the absence of arrows in the natural way I've proposed. Ignore Epp's claims about what the automaton does and just look at the diagram. Describe a DFA, and a regex, that accept exactly the same strings as the NFA displayed in that diagram.
### More powerful automata ###
Membership in languages that are non-regular can only be decided by automata that are more powerful than finite-state automata.
* One way to get a more powerful automaton is to equip it with a "stack", where the automaton can leave notes to itself and base decisions upon later. These are called [pushdown automata](http://en.wikipedia.org/wiki/Pushdown_automaton). Given any language generated by a context-free grammar, one can specify a nondeterministic pushdown automaton, needing only finitely many states and a single stack, which can decide membership in that language. (See the end of the optional Sipser Chapter 2 reading for details.)
* If you equip a pushdown automaton with *two* stacks, then it becomes much more powerful --- in fact, as powerful as a Turing machine. But there are automata intermediate in strength between one-stack pushdown automata and Turing machines. Some of these are called linear-bound automata; they're essentially a Turing machine (see below) handicapped by having only a finite amount of memory. These are capable of deciding membership in languages generated by context-sensitive grammars, which we've mentioned but haven't (and won't) discuss in detail.
As I mentioned earlier, Turing machines are regarded as fundamentally the most powerful automata we can specify. In the early part of the 20th century, there were a variety of strategies for formalizing the notion of "effectively calculating" some result, and these all turned out to be provably equivalent to Turing machines.
(There's no reason why we have to take Turing machines rather than the other formalisms as being more fundamental; it's just typical pedagogy to focus on them.)
At the same time, we've accumulated a variety of questions that we can prove *aren't* answerable, in general, by a Turing machine. (Usually, the questions will be answerable in special cases. But not for arbitrary well-defined inputs.) This has prompted some theorists to explore the notion of [Hypercomputers](http://en.wikipedia.org/wiki/Hypercomputation). Such machines are most likely not physically constructable, but arguably neither are Turing machines. Regardless of that, many theorists aren't inclined to count the methods employed by hypercomputers as still being "effective," that is, the kind of rote or mechanical procedure we're trying to formalize. So generally, Turing machines, and the other formalisms that are equivalent to them, are thought of as constituting the limits of what can be effectively calculated or decided. (This belief is called the **Church-Turing thesis**.)
Like the other enhanced automata we described, Turing machines are equipped with memory apart from the vertices of their decision graph. This memory is thought of as a tape that is potentially infinitely long, in at least one direction.
Turing machines read their input from, and write their output to, that tape, rather than being force-fed it one letter at a time, as finite-state automata are.
* For languages generated by the most complex grammars, even Turing machines aren't able to decide whether arbitrary strings are included. They are however able to enumerate all the strings that are included.
I believe that no natural category of grammar is known to generate exactly the languages where Turing machines *can* decide membership.
It is provable that there are more languages (sets of strings over a given alphabet) than there are Turing machines. It follows that some languages aren't even enumerated by any Turing machine. We cannot give grammars that generate those languages; but we can formally specify some of them. For any language $L\se\Sigma^*$ that Turing machines can enumerate but not decide membership in, the language $\Sigma^*-L$ must be one that no Turing machine can enumerate. If some Turing machine *could* enumerate it, then we could specify a machine that emulated the enumeration of both $L$ and its complement, and stopped as soon as a specified string appeared in either list; thereby deciding yes-or-no whether that string belongs to $L$. Which by assumption, no Turing machine can do.