# Introduction to Automata Theory

Automata theory is the study of abstract machines (like finite state machines), automata (self-operating machines) and the computational problems that can be solved using them. It comes under the broad topic of Theoretical Computer Science, under Discrete Mathematics. Essentially, Automata Theory is the study of self-operating virtual machines to help with the logical understanding of input and output processes, with or without intermediate stage(s) of computation/ function/process. Automata play a major role in theory of computation, compiler design, artificial intelligence, parsing and formal verification and have been around since ancient times. Automata (pl.) are defined to study useful machines under mathematical formalism. So, the definition of an automaton is open to variations according to the “real world machine”, which we want to model using the automaton (sg.). There are four main components of any automata that define its variant:

Input :-

a. Finite input: An automaton that accepts only finite sequence of symbols.

b. In finite input: An automaton that accepts infinite words (?-words). Such automata are called ?-automata.

States :-

a. Finite states: An automaton that contains only a finite number of states.

b. Infinite states: An automaton that may not have a finite number of states, or even a countable (http://en.wikipedia. org/wiki/Countable_set) number of states. For example, the quantum finite automaton or topological automaton has uncountable infinity of states.

Transition function :-

a. Deterministic: For a given current state and an input symbol, if an automaton can only jump to one and only one state.

b. Nondeterministic: An automaton that, after reading an input symbol, may jump into any of a number of states, as licensed by its transition relation, i.e. the automaton non-deterministically decides to jump into one of the allowed choices.

Acceptance condition : –

We will be primarily concentrating on finite or countable infinite input on finite state machines in the rest of this article. Deterministic Finite Automata (DFA) and Non-deterministic Finite Automata (NFA) have the same computational power and both are capable of recognizing the same languages, called the regular languages that regular expressions can describe. A tutorial on regular expressions can be found on http://regexone.com/. Now if we take the language to be L = (a|b)*abb. Then L will be the set of strings that are accepted by this regular expression, which is read as “a or b, zero or many times, followed by the string abb” or “all the strings in the alphabet {a,b} that end in abb”. Language L = (a|b)*abb = {abb, aabb, babb, aaabb, ababb, baabb, bbabb,…} superscript (*) – Kleene closure – zero or many superscript (+) – Positive closure – one or many e – Epsilon – empty string The NFA for this language L is depicted in Fig. 1. States are represented by numerals in circles and the transition between states is described by alphabets over the arrow.

The start state is 0 and the end state (or accepting state) is 10, which is denoted by the double-circle. Sometimes the end state is realized by making the said state accept the transition labeled #. An NFA accepts input string x if and only if there is a path in the transition graph from the start state (0) to one of the accepting states, such that the symbols along the path spell out x. So, the input stringbbabb is accepted because of the path 0-1-4-5-6-1-4-5-6-7-8-9-10. [optional] Q. Find out three strings that are not accepted by the NFA for L. A DFA is a special case of an NFA where:

1. There are no moves on input e, and

2. For each state s and input symbol a, there is exactly one edge out of s labeled a.

This is the reason why a DFA shares the same notation as an NFA. While the NFA is an abstract representation of an algorithm to recognize the strings of a certain language, the DFA is a simple and concrete algorithm for recognizing strings. The DFA for L is shown in Fig. 2 The input string abbabb will be accepted by this DFA because of the path 0-1-2-3-1-2-3. Note that a string will be accepted by the DFA only if it ends on the end state (3). Deterministic Finite Automata (DFA) is the most standard variant and was invented to model and study real-life scenarios that the Turing Machine (https://en.wikipedia.org/wiki/Turing_machine) was unable to demonstrate. Since the DFA is used to implement or simulate when building lexical analyzers, it is fortunate that every regular expression or NFA sharing the same language can be converted into a DFA, which is beyond the scope of this article. [optional] We can represent either an NFA or DFA by a transition graph, where there is an edge labeled ‘a’ from state ‘s’ to state ‘t’ if and only if ‘t’ is one of the next states for state ‘s’ and input ‘a’.

In this graph, as shown in Fig. 3 for language L,

a. The same symbol can label edges from one state to many different states, and

b. An edge may be labeled by e, instead of or in addition to symbols from the input alphabet. For searching *.java files, a DFA can prove to be very useful if the filename is composed of ASCII characters (real language) only. Automata simulators are used to learn and research automata theory. An automata simulator takes as input the description of an automaton and then simulates its working for an arbitrary input string.

The description of the automaton can be entered in several ways. An automaton can be defined in a symbolic language or its specification may be entered in a predesigned form or its transition diagram may be drawn by clicking and dragging the mouse. Well known automata simulators include Turing’s World, JFLAP, VAS, TAGS and SimStudio In conclusion, each model in the automata theory plays important roles in several applied areas. Finite Automata are used in text processing, compilers and hardware design, and Context-Free Grammar (CFG) [http://en.wikipedia.org/wiki/Context-free_grammar] is used in programming languages and artificial intelligence and also for study in human languages. [optional] Let’s see if you can figure out what this transition diagram accepts as a language. Answer: Unsigned numbers (with E used as exponent for base 10).

Fig. 4. This is an example of how the transition diagram looks like in automata theory

Fig. 3 This is the transition graph for the regular expression

(a|b)*abb

Fig. 2 This transition graph is a DFA for the regular expression

(a|b)*abb

Fig.1 This transition graph is an NFA for the regular expression

(a|b)*abb