NP complexity class: Difference between revisions
imported>Pat Palmer (→Optimization problems: changed about to above) |
imported>Warren Schudy (Finished intro) |
||
Line 1: | Line 1: | ||
The '''NP complexity class''' (Non-deterministic Polynomial time) consists of all types of yes/no questions where whenever the | |||
answer is yes, there is an easily verifiable proof of this fact. For example, the '''traveling salesperson problem''' has as input | |||
a set of cities, the cost of available transit links between them, and a budget. The question is whether or not there exists | |||
an order than a salesperson can visit all of the cities without exceeding the budget. The easily verified proof in this case | |||
is simply the order to visit the cities. The definition of "easily verifiable" is that the verification can be done in time | |||
polynomial in the length of the input. See technicalities section for a formal definitions of these terms. | |||
A type of yes/no question is called a '''decision problem'''. Any decision problem that can be solved in polynomial time, such as | |||
"does 24554+254=5987987", is in NP - just use "trivial" as the proof. A natural question arises - what is the hardest problem | |||
in NP? The way to compare the hardness of problems is polynomial-time reductions. Problem B <b>reduces</b> to problem C if | |||
one could use a polynomial-time solution to problem C to solve problem B in polynomial time. Theoretically, problem B is no | |||
harder than problem C. | |||
It is widely suspected that no NP- | It turns out that every problem in NP reduces to the traveling salesman problem. The traveling salesman problem is not | ||
special in this regard; there are hundreds of so-called <b>NP-hard</b> problems which any problem in NP can be reduced to. If | |||
a problem is both NP-hard and in NP, it is '''NP-complete'''. NP-complete problems capture the essence of NP, so informal | |||
references to NP are usually references to NP-completeness. | |||
Put another way, the NP-complete problems all have a certain hardness. Problems in NP are the ones that are no <i>harder</i> than the NP-complete ones, and problems that are NP-hard are <i>at least as hard</i>. | |||
It is widely suspected that no NP-complete problem can be solved in polynomial time, but this has not been proven. The <math>P \neq NP</math> | |||
question is one of the greatest open problems in computer science and mathematics; the Clay math institute has offered a $1 million US dollar prize for a solution. | |||
=Technicalities= | =Technicalities= |
Revision as of 10:58, 18 November 2007
The NP complexity class (Non-deterministic Polynomial time) consists of all types of yes/no questions where whenever the answer is yes, there is an easily verifiable proof of this fact. For example, the traveling salesperson problem has as input a set of cities, the cost of available transit links between them, and a budget. The question is whether or not there exists an order than a salesperson can visit all of the cities without exceeding the budget. The easily verified proof in this case is simply the order to visit the cities. The definition of "easily verifiable" is that the verification can be done in time polynomial in the length of the input. See technicalities section for a formal definitions of these terms.
A type of yes/no question is called a decision problem. Any decision problem that can be solved in polynomial time, such as "does 24554+254=5987987", is in NP - just use "trivial" as the proof. A natural question arises - what is the hardest problem in NP? The way to compare the hardness of problems is polynomial-time reductions. Problem B reduces to problem C if one could use a polynomial-time solution to problem C to solve problem B in polynomial time. Theoretically, problem B is no harder than problem C.
It turns out that every problem in NP reduces to the traveling salesman problem. The traveling salesman problem is not special in this regard; there are hundreds of so-called NP-hard problems which any problem in NP can be reduced to. If a problem is both NP-hard and in NP, it is NP-complete. NP-complete problems capture the essence of NP, so informal references to NP are usually references to NP-completeness.
Put another way, the NP-complete problems all have a certain hardness. Problems in NP are the ones that are no harder than the NP-complete ones, and problems that are NP-hard are at least as hard.
It is widely suspected that no NP-complete problem can be solved in polynomial time, but this has not been proven. The question is one of the greatest open problems in computer science and mathematics; the Clay math institute has offered a $1 million US dollar prize for a solution.
Technicalities
Variants
- Strongly NP hard
- the problem is still NP-hard if all numbers in the problem are expressed in unary.
- Weakly NP hard
- the problem is polynomial-time solvable if all numbers in the problem are
Optimization problems
The definition of NP given above is for yes/no decision problems only. However, many natural problems are optimization problems, for example the traveling salesman problem, where the question is not whether or not a tour exists, but the cost of the cheapest one. Optimization problems are considered in this framework by considering a decision variant. For example, the decision variant of the traveling salesman problem is the question of whether or not there exists a tour with cost less than a given value. An optimization problem is classified as being NP-hard, in NP or NP-complete based on the classification of its decision variant.
Example NP-complete problems
Two NP-complete problems are particularly important because many other problems can be easily reduced to them: 3-SAT and Integer Programming.
3-SAT
Satisfiability is determining whether or not a given boolean formula in conjunctive normal form admits an assignment of values to the variables which satsfies all the clauses. The special-case of 3-SAT is the restriction of satisfiability to 3 variables per clause; one can easily convert a general satisfiability problem into a 3-SAT problem by adding extra variables and splitting clauses.
Integer Programming
If one removes the constraint that the variables be integer-valued and allows them to be real-valued, the result is the linear relaxation of the problem. Linear relaxations are crucial both for branch and bound and approximation algorithms (discussed below).
MAX CUT
Many important techniques, such as semi-definate programming for approximation algorithms, were first used to solve max cut and later extended to other problems.
Traveling salesman problem
This problem is particularly famous because it is easy to explain.
Knapsack
The knapsack problem is only weakly NP-hard.
Example problems whose hardness is unknown
Factoring
The problem of determining the factors of a composite integer is famously difficult. Factoring is in NP, but is not known whether or not it is NP hard. Factoring $n$ as such is not a decision problem, but one can ask the question of whether In fact, the related decision problem of determining whether an integer is prime or composite was shown to be in P.
Graph Isomorphism
Determining whether or not two given graphs are isomorphic is another important problem which is not known to either be polynomial-time solvable or NP-complete.
Nash equilibrium
Many problems related to Nash equiliria are not known to be polynomially-time solvable but are not known to be NP-complete. However, finding a Nash equilibria is PPAD-complete.
Solving NP-complete problems
NP-complete problems are extremely important in applications, so one cannot simply give up when a problem is shown to be NP-hard. Fortunately, there are at least three ways to get around NP-hardness. NP hardness is an asymptotic result of worst-case instances, so it is often possible to use branch and bound to solve the instances one cares about in a reasonable amount of time. Local search techniques cannot generally be proven to work well, but they often work excellently in practice. For optimization problems, one can
Branch and Bound
Local Search and other heuristic approaches
In computer science, a heuristic is a technique which works works in practice but comes with no theoretical guarentees of performance.
Local search techniques start with a solution and try to improve it. Local search techniques include:
- Simulated annealing
- inspired by nature's ability to find low-energy configurations in statistical physics, accept moves that reduce the quality of the solution with probability . Some simulated annealing algorithms can be theoretically shown to converge to the right answer if one waits infinitely long, but these theoretical guarentees are worse than exhaustive search and therefore of minimal import.
- Tabu Search
- to prevent the search from repeating past mistakes, keep an explicit list of changes made recently and avoid changing one's mind.
- Genetic Algorithms
- Inspired by biological evolution. Rather than keeping 1 current solution, keep a population of solutions. Simulate mutation and sexual reproduction.
Approximation Algorithms
Determining the exact optimum value of an NP-complete optimization problem in polynomial time is not possible unless P=NP, but many such problems can be solved approximately in polynomial time. For example, the MAX-CUT problem (see above) can be approximated to within a factor of two by simply choosing a random cut! To see that this is a two-approximation, note that firstly, the optimum cut can cut at most all the edges, and seconly, every edge will be cut with probability 1/2, so on average half of the edges will be cut.