The Learnability of Quantum States

Scott Aaronson, University of Waterloo

Outline
- A Quantum Occam’s Razor Theorem
  - Why you should want it to be true
  - Why it is true
  - Application to quantum communication
  - Application to quantum advice
- Sneak Preview: Quantum Software Copy-Protection
  - What it has to do with learning
  - Why it might be possible

[Pictured: the Sun; David Hume (1711-1776)]
Why do we believe the sun will rise tomorrow? The hypothesis that it will rise every day until tomorrow is equally compatible with the evidence…
In my view, a branch of CS called computational learning theory has pretty much solved this Humean Problem of Induction, insofar as it has a solution…

In particular: If you want to output a hypothesis from set $H$ that explains at least a $1-\varepsilon$ fraction of future data with probability at least $1-\delta$, then $O\left(\frac{1}{\varepsilon}\log\frac{|H|}{\delta}\right)$ data points suffice.
Occam’s Razor Theorem (Valiant, Vapnik, Blumer et al…): “If the possible hypotheses have sufficiently fewer bits than the data you’ve collected, and if one of those hypotheses succeeds in explaining your data, then that hypothesis will probably also explain most of the data you haven’t collected”
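As a rough numerical illustration of the classical statement, the sketch below evaluates the standard finite-class bound $m \ge \frac{1}{\varepsilon}\ln\frac{|H|}{\delta}$ for a learner that outputs a hypothesis consistent with all of its data. The function name `occam_sample_bound` and the example parameters are assumptions made for illustration; they do not come from the talk.

```python
import math


def occam_sample_bound(num_hypotheses: int, epsilon: float, delta: float) -> int:
    """Number of samples sufficient for a consistent learner over a finite class H.

    Uses the textbook bound m >= (1/epsilon) * ln(|H| / delta).
    (Hypothetical helper, named here for illustration only.)
    """
    if num_hypotheses < 1:
        raise ValueError("need at least one hypothesis")
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil(math.log(num_hypotheses / delta) / epsilon)


# Hypotheses describable with 20 bits, 10% error, 95% confidence.
m = occam_sample_bound(2**20, epsilon=0.1, delta=0.05)
print(m)
```

The point to notice is that the bound grows with the number of bits needed to describe a hypothesis ($\log|H|$), not with the number of hypotheses itself.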

Trouble in QuantumLand
- To describe a quantum state of $n$ qubits takes ~$2^n$ classical bits
- Indeed, traditional quantum state tomography requires $\Omega(2^{2n})$ measurements on copies of the state
- Does this mean that a generic 10,000-particle state can never be “learned” within the lifetime of the universe?
- If so, this would call into question the operational status of many-body quantum states themselves…
[Figure: an “operationally meaningful subset” drawn inside Hilbert space]
Fear not, physicists! Why would he even be raising this “dilemma” if he wasn’t gonna demolish it on the very next slide?

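The Quantum Occam’s Razor Theorem
Let $\rho$ be an $n$-qubit mixed state. Let $D$ be a distribution over two-outcome measurements. Suppose we draw $m$ measurements $E_1,\ldots,E_m$ independently from $D$, and then output a “hypothesis state” $\sigma$ such that $|\mathrm{Tr}(E_i\sigma)-\mathrm{Tr}(E_i\rho)|\le\eta$ for all $i$. Then provided $\eta\le\gamma\varepsilon/10$ and
$$m \ge \frac{K}{\gamma^2\varepsilon^2}\left(\frac{n}{\gamma^2\varepsilon^2}\log^2\frac{1}{\gamma\varepsilon}+\log\frac{1}{\delta}\right)$$
for a suitable constant $K$, we’ll have
$$\Pr_{E\in D}\left[\,|\mathrm{Tr}(E\sigma)-\mathrm{Tr}(E\rho)|>\gamma\,\right]\le\varepsilon$$
with probability at least $1-\delta$ over $E_1,\ldots,E_m$.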
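To make the objects in the theorem concrete, here is a minimal NumPy sketch: a mixed state is a density matrix, a two-outcome measurement is represented by a projector $E$ with acceptance probability $\mathrm{Tr}(E\rho)$, and a hypothesis state is checked against a training set and then scored on fresh measurements. The helper names, the choice of random projectors as the distribution $D$, and all numeric parameters are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_density_matrix(dim, rng):
    """Random mixed state: A A^dagger, normalized to unit trace."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real


def random_projector(dim, rank, rng):
    """Projector onto a random rank-dimensional subspace (a two-outcome measurement)."""
    a = rng.normal(size=(dim, rank)) + 1j * rng.normal(size=(dim, rank))
    q, _ = np.linalg.qr(a)  # columns of q are orthonormal
    return q @ q.conj().T


def accept_prob(E, state):
    """Probability that measurement E accepts the state: Tr(E * state)."""
    return float(np.trace(E @ state).real)


def is_consistent(sigma, rho, measurements, eta):
    """True if sigma matches rho to within eta on every training measurement."""
    return all(abs(accept_prob(E, sigma) - accept_prob(E, rho)) <= eta
               for E in measurements)


def failure_rate(sigma, rho, measurements, gamma):
    """Fraction of measurements on which sigma is off by more than gamma."""
    return float(np.mean([abs(accept_prob(E, sigma) - accept_prob(E, rho)) > gamma
                          for E in measurements]))


n_qubits = 3
dim = 2 ** n_qubits
rho = random_density_matrix(dim, rng)
train = [random_projector(dim, dim // 2, rng) for _ in range(50)]
test = [random_projector(dim, dim // 2, rng) for _ in range(200)]

maximally_mixed = np.eye(dim) / dim  # one candidate hypothesis state
print(is_consistent(maximally_mixed, rho, train, eta=0.01))
print(failure_rate(maximally_mixed, rho, test, gamma=0.1))
```

Searching for a state $\sigma$ that passes `is_consistent` is the hard part and is not attempted here; the sketch only shows what "consistent on the training set" and "wrong on at most an $\varepsilon$ fraction of future measurements" mean operationally.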

Q: But what if I can’t estimate the $\mathrm{Tr}(E\rho)$’s? What if for each measurement $E$, all I get is a bit that’s 1 with probability $\mathrm{Tr}(E\rho)$ and 0 with probability $1-\mathrm{Tr}(E\rho)$?
A: In that case you need this many measurements: [formula not preserved in this transcript]
Upshot for Experimentalists
- You can do “pretty good tomography” on an arbitrary entangled state of $n$ spins, using a number of measurements that scales only linearly (!) with $n$
- Here “pretty good” means with respect to any fixed distribution over observables

To prove the theorem, we need a notion introduced by Kearns and Schapire called Fat-Shattering Dimension
Let $C$ be a class of functions from $S$ to $[0,1]$. We say a set $\{x_1,\ldots,x_k\}\subseteq S$ is $\gamma$-shattered by $C$ if there exist reals $a_1,\ldots,a_k$ such that, for all $2^k$ possible statements of the form
$$f(x_1)\le a_1-\gamma \;\wedge\; f(x_2)\ge a_2+\gamma \;\wedge\; \cdots \;\wedge\; f(x_k)\le a_k-\gamma,$$
there’s some $f\in C$ that satisfies the statement.
Then $\mathrm{fat}_C(\gamma)$, the $\gamma$-fat-shattering dimension of $C$, is the size of the largest set $\gamma$-shattered by $C$.
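The definition can be checked by brute force for a small finite class. The sketch below tests whether a given set of points is $\gamma$-shattered for a given choice of witnesses $a_1,\ldots,a_k$, by enumerating all $2^k$ above/below patterns. The function name and the toy class (four functions on two points taking values 0.1 or 0.9) are assumptions for illustration; computing the fat-shattering dimension itself would also require searching over point sets and witnesses.

```python
from itertools import product


def is_gamma_shattered(points, functions, witnesses, gamma):
    """Check the shattering condition for fixed witnesses a_1..a_k.

    For every pattern of 'above'/'below' over the points, some f in the class
    must satisfy f(x_i) >= a_i + gamma where the pattern says 'above'
    and f(x_i) <= a_i - gamma where it says 'below'.
    """
    for pattern in product([False, True], repeat=len(points)):
        realized = any(
            all((f(x) >= a + gamma) if above else (f(x) <= a - gamma)
                for x, a, above in zip(points, witnesses, pattern))
            for f in functions
        )
        if not realized:
            return False
    return True


# Toy class: every function from {0, 1} to {0.1, 0.9}.
points = [0, 1]
tables = list(product((0.1, 0.9), repeat=2))
functions = [lambda x, t=t: t[x] for t in tables]

print(is_gamma_shattered(points, functions, witnesses=[0.5, 0.5], gamma=0.25))
print(is_gamma_shattered(points, functions, witnesses=[0.5, 0.5], gamma=0.45))
```

With witnesses at 0.5 the toy class realizes all four patterns at margin 0.25, but no function clears a margin of 0.45, so the same set is not shattered at that larger $\gamma$.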

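Small Fat-Shattering Dimension Implies Small Sample Complexity
(Proof uses a 1996 result of Bartlett and Long)
Let $C$ be a class of functions from $S$ to $[0,1]$, and let $f\in C$. Suppose we draw $m$ elements $x_1,\ldots,x_m$ independently from some distribution $D$, and then output a hypothesis $h\in C$ such that $|h(x_i)-f(x_i)|\le\eta$ for all $i$. Then provided $\eta\le\gamma\varepsilon/7$ and
$$m \ge \frac{K}{\gamma^2\varepsilon^2}\left(\mathrm{fat}_C\!\left(\frac{\gamma\varepsilon}{35}\right)\log^2\frac{1}{\gamma\varepsilon}+\log\frac{1}{\delta}\right)$$
for a suitable constant $K$, we’ll have
$$\Pr_{x\in D}\left[\,|h(x)-f(x)|>\gamma\,\right]\le\varepsilon$$
with probability at least $1-\delta$ over $x_1,\ldots,x_m$.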

Upper-Bounding the Fat-Shattering Dimension of Quantum States
(Proof uses Ashwin Nayak’s lower bound for “quantum random access codes,” which in turn uses Holevo’s Theorem on quantum channel capacity)
Let $S$ be the set of two-outcome measurements on $n$ qubits. Let $C_n$ be the set of functions $f:S\to[0,1]$ defined by $f(E)=\mathrm{Tr}(E\rho)$ for some $n$-qubit mixed state $\rho$. Then $\mathrm{fat}_{C_n}(\gamma)=O(n/\gamma^2)$.
Quantum Occam’s Razor Theorem is then just plug & chug…
[Image caption: “No need to thank me!”]

Simple Application of Quantum Occam’s Razor Theorem to Communication Complexity
[Figure: Alice (labeled “Alice Walker”) with input $x$, Bob (labeled “Bob Dylan”) with input $y$, output $f(x,y)$]
• $f$: Boolean function mapping Alice’s $N$-bit string $x$ and Bob’s $M$-bit string $y$ to a binary output
• $D^1(f)$, $R^1(f)$, $Q^1(f)$: Deterministic, randomized, and quantum one-way communication cost of $f$
• How much can quantum communication save?
• It’s known that $D^1(f)=O(M\,Q^1(f))$ for all total $f$
• In 2004 I showed that for all $f$, $D^1(f)=O(M\,Q^1(f)\log Q^1(f))$

Theorem: $R^1(f)=O(M\,Q^1(f))$ for all $f$, partial or total
Proof:
- By Yao’s minimax principle, Alice can consider a worst-case distribution $D_x$ over Bob’s input $y$
- Alice’s classical message will consist of $y_1,\ldots,y_T$ drawn from $D_x$, together with $f(x,y_1),\ldots,f(x,y_T)$. Here $T=\Theta(Q^1(f))$
- Bob searches for a quantum message $\rho$ that yields the right answers on $y_1,\ldots,y_T$ (certainly such a $\rho$ exists)
- By the Quantum Occam’s Razor Theorem, with high probability such a $\rho$ yields the right answers on most $y$ drawn from $D_x$

Computational Complexity of Learning Quantum States
I showed that, if you find a state $\sigma$ that explains $O(n)$ measurements drawn from $D$, with high probability that $\sigma$ will correctly explain most future measurements drawn from $D$.
This says nothing about the computational problem of finding $\sigma$! Indeed, if $\sigma$ can always be prepared by a polynomial-time quantum algorithm, then no one-way function is secure against quantum attack.

To say more, we need to visit the bestiary…
[Figure: diagram of the complexity classes PostBQP/poly, BQP/qpoly, QMA/poly, YQP/poly, QMA, BQP/poly, YQP, BQP]
YQP: Yaroslav Quantum Polynomial-Time
Class of problems solvable efficiently on a quantum computer, with the help of polynomial-size untrusted quantum advice

Theorem: AvgBQP/qpoly = AvgYQP/poly
Or in English: We can use trusted classical advice to verify that untrusted quantum advice will work on most inputs.
Proof Idea:
- The classical advice will consist of “training inputs” $x_1,\ldots,x_m$, as well as whether $x_i\in L$ for all $1\le i\le m$
- Given a purported advice state $|\psi\rangle$, first check that $|\psi\rangle$ yields the right answers on $x_1,\ldots,x_m$, and only then use it on the $x$ you care about
- By the Quantum Occam’s Razor Theorem, $m=O(\mathrm{poly}(n))$ is enough to ensure $|\psi\rangle$ will work on most inputs w.h.p.
- The technical part is to do the verification without damaging $|\psi\rangle$ too badly

Quantum Copy-Protection
- We say a program $P$ is copy-protected if there’s no efficient algorithm that, given $P$’s source code, outputs two programs with the same input/output behavior as $P$
- Classically, copy-protection is trivially impossible (tell that to Sony/BMG…)
- Quantumly: well, it’s called the “No-Cloning Theorem” for a reason…
- Connection to learning: If $P$ can be learned from input/output behavior, then it can’t be copy-protected

A Weird Example
- Let $G$ be a finite group, such that we can efficiently prepare $|G\rangle$ (a uniform superposition over $g\in G$)
- Let $H\le G$ be a subgroup with $|H|\ge|G|/\mathrm{polylog}|G|$
- Let $f(g)=1$ if $g\in H$ and $f(g)=0$ otherwise
- Given $|H\rangle$ (a uniform superposition over $H$), Watrous showed that we can efficiently compute $f$: test whether $|H\rangle$ and $|gH\rangle$ are equal or orthogonal
- Conversely, given a black box that computes $f$, we can efficiently prepare $|H\rangle$: first prepare $|G\rangle$, then postselect on $f(g)=1$
- So any program for $f$ can be pirated, but (apparently) only in an indirect, quantum way
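The two directions of this example can be checked on a toy instance with ordinary linear algebra. The sketch below uses the cyclic group $\mathbb{Z}_{12}$ with subgroup $H=\{0,3,6,9\}$, both chosen here purely for illustration. It computes the overlap $\langle H|gH\rangle$ directly from the state vectors, which is a classical stand-in for the quantum equal-or-orthogonal test, and it mimics postselection by zeroing amplitudes and renormalizing. All function names are hypothetical.

```python
import numpy as np

N = 12            # toy group Z_12 under addition mod 12 (an assumption for illustration)
H = [0, 3, 6, 9]  # subgroup generated by 3


def uniform_superposition(elements, dim):
    """State vector with equal amplitude on the given basis elements."""
    v = np.zeros(dim)
    v[list(elements)] = 1.0
    return v / np.linalg.norm(v)


def coset_state(g):
    """|gH>: uniform superposition over the coset g + H."""
    return uniform_superposition([(g + h) % N for h in H], N)


def f_from_state(g, h_state):
    """Decide membership of g in H from |H>: overlap <H|gH> is 1 if g is in H, else 0."""
    overlap = float(np.dot(h_state, coset_state(g)))
    return 1 if overlap > 0.5 else 0


def postselect_on_f(f):
    """Prepare |H> from |G> by keeping only amplitudes with f(g) = 1."""
    g_state = uniform_superposition(range(N), N)
    mask = np.array([f(g) for g in range(N)], dtype=float)
    kept = g_state * mask
    return kept / np.linalg.norm(kept)


h_state = uniform_superposition(H, N)
print([f_from_state(g, h_state) for g in range(N)])

recovered = postselect_on_f(lambda g: 1 if g in H else 0)
print(np.allclose(recovered, h_state))
```

Cosets of a subgroup are either identical or disjoint, which is why the overlap is exactly 1 or 0 and the membership test never sees an intermediate value.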

The Pirate’s Nightmare
In the quantum world, can any program that can’t be learned be copy-protected?
Main Result: There exists a “quantum oracle” relative to which the answer is yes
Upshot: Even if the answer is no, we can’t prove it without using “quantumly nonrelativizing techniques”

Handwaving Proof Idea
- For each circuit $C$, choose a “meaningless quantum label” $|\psi_C\rangle$ according to the Haar measure
- The quantum oracle will map $|\psi_C\rangle|x\rangle|0\rangle$ to $|\psi_C\rangle|x\rangle|C(x)\rangle$, as well as $|C\rangle|0\rangle$ to $|C\rangle|\psi_C\rangle$
- Intuitively, then, being given $|\psi_C\rangle$ is “no better” than being given a black box for $C$
- To prove this, we need to simulate an algorithm that prepares $|\psi_C\rangle$ given another copy of $|\psi_C\rangle$, by an algorithm that prepares $|\psi_C\rangle$ given only black-box access to $C$
- Strategy: Mimic the copying algorithm, by “mocking up” a random pure state $|\psi\rangle$ that plays the same role as $|\psi_C\rangle$
- Problem: “Mocking up” a random pure state takes exponential time

Solution: Pseudorandom States
$$|\psi_p\rangle=\frac{1}{2^{n/2}}\sum_{x\in\{0,1\}^n}(-1)^{p_0(x)}|x\rangle$$
where $p$ is a degree-$d$ univariate polynomial over $GF(2^n)$ for some $d=\mathrm{poly}(n)$, and $p_0(x)$ is the “leading bit” of $p(x)$
- Clearly the $|\psi_p\rangle$’s can be prepared in polynomial time
- Lemma: If $p$ is chosen uniformly at random, then $|\psi_p\rangle$ “looks like” it was chosen under the Haar measure
  - Even if we get polynomially many copies of $|\psi_p\rangle$
  - Even if we query the quantum oracle, which depends on $|\psi_p\rangle$
- So the simulator can use $|\psi_p\rangle$’s in place of $|\psi_C\rangle$’s
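A small classical simulation shows what such a state looks like. The sketch below assumes the state has the phase form $|\psi_p\rangle = 2^{-n/2}\sum_x (-1)^{p_0(x)}|x\rangle$, takes “leading bit” to mean the most significant bit, and works in $GF(2^3)$ with the irreducible polynomial $x^3+x+1$. These choices, and every function name, are assumptions for illustration; writing out the full vector takes exponential space, so this only illustrates the definition for tiny $n$.

```python
import numpy as np

N_BITS = 3
MODULUS = 0b1011  # x^3 + x + 1, irreducible over GF(2); field elements are ints 0..7


def gf_mul(a, b):
    """Multiply two elements of GF(2^N_BITS) (carry-less multiply, then reduce)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << N_BITS):
            a ^= MODULUS
    return result


def poly_eval(coeffs, x):
    """Evaluate a polynomial over the field by Horner's rule (highest degree first)."""
    acc = 0
    for c in coeffs:
        acc = gf_mul(acc, x) ^ c
    return acc


def leading_bit(value):
    """Most significant bit of a field element (assumed meaning of 'leading bit')."""
    return (value >> (N_BITS - 1)) & 1


def pseudorandom_state(coeffs):
    """Amplitude vector with sign (-1)^{p_0(x)} on each basis state |x>."""
    dim = 1 << N_BITS
    signs = np.array([(-1.0) ** leading_bit(poly_eval(coeffs, x)) for x in range(dim)])
    return signs / np.sqrt(dim)


rng = np.random.default_rng(1)
degree = 2
coeffs = [int(c) for c in rng.integers(0, 1 << N_BITS, size=degree + 1)]
state = pseudorandom_state(coeffs)
print(coeffs, state)
```

Every amplitude has magnitude $2^{-n/2}$; only the signs depend on $p$, which is why such a state can be prepared from the uniform superposition with one evaluation of $p$ in superposition.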

Open Problems
- Can we tighten the Quantum Occam’s Razor Theorem? The best lower bounds I can prove grow linearly in $n$ and as the inverse square of the error parameter, or as its inverse fourth power in the case where each measurement is applied only once
- Does BQP/qpoly = YQP/poly? I.e., can we use classical advice to verify quantum advice in the worst-case setting?
- Is $D^1(f)=O(M\,Q^1(f))$? Or even $O(M+Q^1(f))$? Even more ambitiously, could learning theory techniques help us show that $R^1(f)=O(Q^1(f))$ for all total $f$?
- In the real world, are there nontrivial programs that can be quantumly copy-protected? What about point functions ($f(x)=1$ if $x$ equals a secret password $s$; otherwise $f(x)=0$)?
