Tuesday, November 16, 2010

Picking the bigger number of two even if one is unknown

Here is a nice problem from the xkcd blog: Two real numbers, A and B, are drawn using some unknown, possibly probabilistic process and written on papers that go into two envelopes. You randomly pick one and open it to find some number on it. You now have to decide whether you want to receive that number as an amount in dollars or rather the number that is in the other envelope (which is still sealed). Can you come up with a process that with probability >50% picks the larger amount?

Think about it.





SPOILER ALERT






You can. You will need a function f that maps the real numbers to the open interval (0,1) in a strictly increasing way; you could for example take f(x) = (1+tanh(x))/2. Say the number you found in the first envelope is X. Then flip a biased coin such that with probability f(X) you keep X and otherwise take the other envelope. Obviously (?), if you started with the envelope containing the smaller number you are more likely to switch than if you had started with the larger one.
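Here is a minimal Monte Carlo sketch of this strategy in Python (the Gaussian distributions I use for the two numbers are an arbitrary stand-in for the unknown process; any distribution works):

```python
import random, math

def f(x):
    """Strictly increasing map from the reals to (0, 1)."""
    return (1 + math.tanh(x)) / 2

def play_once():
    # The adversary's (here: arbitrary) process producing the two numbers.
    a, b = random.gauss(5, 10), random.gauss(-3, 20)
    x, other = (a, b) if random.random() < 0.5 else (b, a)   # open a random envelope
    keep = random.random() < f(x)        # biased coin: keep with probability f(x)
    chosen = x if keep else other
    return chosen == max(a, b)

wins = sum(play_once() for _ in range(200_000))
print("picked the larger number in", wins / 200_000, "of the games")   # > 0.5
```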

This sounds a bit counterintuitive. How can you increase your expected payoff if you know nothing about the number in the second envelope?

You might suspect that something fishy is going on. What comes to mind, for example, is that one can produce paradoxes by assuming a uniform probability distribution on the reals (or integers). But I believe that this is not what is going on here, since I did not say how the numbers were picked: they could have been drawn from any perfectly fine probability measure on the reals; nobody said all numbers were equally likely. Below I will compute the expected outcome for any probability distribution that might have been used, and it always works, not just on average.

More precisely, I think this is unrelated to a similar puzzle: In that second puzzle there are also two envelopes containing numbers that represent payouts, but neither is opened; instead it is known that the number in one envelope is twice the number in the other. You just don't know whether you hold the half or the double. There, assuming your envelope contains X, you could be tempted to argue that with probability 50% the other contains 2X and with probability 50% it contains X/2, so the expectation value is 50% x 2X + 50% x X/2 = 5/4 X, and thus you could increase your expectation by 25% by switching. But then you could increase it by another 25% by using the same argument again and switching back.

In the second puzzle, it is really the implied uniform distribution of X's that is the origin of the paradox: You can see this by giving the additional information that both numbers are definitely smaller than 100 trillion dollars. That sounds like trivial information, but note that the calculation of the expectation value changes: if X is greater than 50 trillion dollars, you know with certainty that the other number cannot be 2X, and thus the expectation for taking the other envelope is not 125% of X but X/2. If you now carefully go through the expectation value calculation you will find that, averaged over all values of X, the expectation for switching is the same as for keeping the first envelope.
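A quick numerical illustration of this point (the uniform distribution and the cut-off value N are made up; they just stand in for some bounded distribution of the smaller amount):

```python
import random

N = 100                  # stand-in for the cut-off: the smaller amount is at most N
trials = 500_000
keep_total = switch_total = 0.0

for _ in range(trials):
    y = random.uniform(0, N)             # smaller amount
    envelopes = (y, 2 * y)
    i = random.randrange(2)              # open a random envelope
    keep_total += envelopes[i]
    switch_total += envelopes[1 - i]

# Both strategies average 1.5 * E[y]; blindly switching gains nothing.
print("always keep:  ", keep_total / trials)
print("always switch:", switch_total / trials)
```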

Some of my readers will notice that the second puzzle is related to recent arguments that were made in the Landscape scenario about the imminent end of the world.

Back to the first game. Let's do a little calculation to compute the expectation of the outcome. We will assume that the numbers were picked according to some probability measure \rho(x)dx (for simplicity, say independently for the two envelopes) which has a finite expectation value, i.e. the integral E = E(X) = \int x\,\rho(x)\,dx converges.

Then, conditional on finding X in the first envelope, the expected payout of the strategy above is X with probability f(X) and E with probability 1-f(X) (as in that case we take the number in the second envelope, whose expectation is E).

We can now compute the expectation E(f(X)X + (1-f(X))E) = E(f(X)X) + E - E\,E(f(X)). For simplicity assume that E=0; otherwise we could pay out E immediately and then subtract E from all numbers in the envelopes. Thus the expected payout of our strategy is E(f(X)X), and it is easy to see that this is positive (so we make more than the average E=0): In computing

E(f(X)X) = \int f(x)\,x\,\rho(x)\,dx

we can, for x<0, bound f(x) from above by f(0) and, for x>0, bound it from below by f(0); since x is negative in the first region and positive in the second, this replacement lowers the integrand in both regions, and we conclude (unless \rho(x) = \delta(x), i.e. the process always hands out 0s)

E(f(X)X) > f(0)E(X) = 0
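Spelled out (this is just the monotonicity argument from the previous sentence written as one chain of inequalities, with f strictly increasing and \rho not concentrated at 0):

E(f(X)X) = \int_{-\infty}^{0} f(x)\,x\,\rho(x)\,dx + \int_{0}^{\infty} f(x)\,x\,\rho(x)\,dx
         > \int_{-\infty}^{0} f(0)\,x\,\rho(x)\,dx + \int_{0}^{\infty} f(0)\,x\,\rho(x)\,dx
         = f(0)\,E(X) = 0,

since f(x) < f(0) and x < 0 on the first region while f(x) > f(0) and x > 0 on the second.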

Thus, on average, we do better than by deciding on one envelope or the other without taking the contents of the first one into account.

Note that, in contrast to the second puzzle, this works for any \rho(x): we did not have to assume some (non-existent) uniform \rho(x), and the effect does not go away as soon as a cut-off is introduced.

Friday, October 15, 2010

Is there a prisoner's dilemma in vaccination?

Recently, I have been thinking about vaccination strategies, as I was confronted with opinions that I consider, to put it mildly, rather risky. So as not to pour oil on the fire I will anonymize the illness and use made-up probabilities. But let me assure you that for the illness I have in mind the real probabilities tell a similar story.

Let's consider illness X. For simplicity assume that if you meet somebody with that illness you'll have it yourself a bit later with 100% probability. If you have X then in 1 in 2000 cases you will develop complication C which is lethal. But C itself is not contagious.

Luckily, there exists a vaccination against X that is 100% effective, i.e. if vaccinated you are immune to X. But unfortunately, the vaccination itself causes the deadly C in 1 in a million cases.

So, the question is: Should you get vaccinated?

Unfortunately, the answer is not clear: It depends on the probability that if not vaccinated you will run into somebody spreading X. If X is essentially eradicated there is no point in taking the vaccination risk but if X is common it is much safer to vaccinate.

The break-even point is obviously when that probability is 1 in 500 (since 1/500 x 1/2000 = 1/1,000,000). If meeting an infected person is less likely than that, it is to your individual advantage not to vaccinate.
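For what it's worth, here is the break-even computation with the made-up numbers from above spelled out in a few lines of Python:

```python
p_complication_if_infected = 1 / 2000        # risk of lethal complication C given X
p_complication_from_vaccine = 1 / 1_000_000  # risk of C from the vaccination itself

# Individual risk without vaccination, as a function of the probability
# p_exposure of meeting somebody who spreads X:
risk_unvaccinated = lambda p_exposure: p_exposure * p_complication_if_infected

# Break-even exposure probability: the two risks are equal.
p_break_even = p_complication_from_vaccine / p_complication_if_infected
print(p_break_even)                          # 0.002, i.e. 1 in 500
print(risk_unvaccinated(p_break_even))       # 1e-06, equal to the vaccination risk
```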

Unfortunately, the probability of meeting an X-infected person depends on how well people are vaccinated: as X is so contagious, if the vaccination rate drops, the probability of meeting somebody with X increases dramatically. That is, not vaccinating might be profitable for you individually, but if everybody follows the same strategy the vaccination rate drops and society as a whole sees many more cases of C.

If you assume in addition that your information is not perfect and you might be wrong in estimating the probabilities involved it is not clear to me to which fixed point this system evolves.

But it seems likely to me that there are situations where, for society as a whole, it is much better if you get vaccinated, even if this increases your personal risk of encountering C.

Opinions?

Tuesday, September 14, 2010

Tensor factor subalgebras question

After some serious arm-twisting by some TMP students I agreed to run a seminar on Foundations of Quantum Mechanics, under the condition that it would be a no-nonsense class. This is in addition to the String Theory lectures I have to teach.

After coming back from our vacation and workshop I found myself in the situation that I had much more fun doing some reading for the seminar (reviews on decoherence etc) than preparing for the string class (given that I have already twice taught Intro to Strings and that there are David Tong's wonderful lecture notes which give you the impression that you could take a few pages of those and be well prepared for class).

In the preparation, I came across a wonderful video of a lecture by a well known physicist (I will link this later, as I might use part of it as a quiz for the first session; let me just mention that it contains the clearest version of Bell's inequality that I am aware of).

I am convinced that a lot of the possible confusion about quantum physics and locality (let me only mention the three letters E, P and R) comes from the fact that people confuse the roles of observables and states: Observables can be local, and causality is built in by demanding that operators localised at space-like separation commute, while states are always global objects. There is no such thing as "the wave function of electron 1", or only in the approximation where you ignore all the other particles; you cannot use it when talking about correlations etc. But this is not bad: even in classical (statistical) physics there are non-local correlations, like the colours of the socks on my two feet. The fact that in the quantum theory there can be entanglement in addition to correlations does not change that.

Furthermore, I find it helpful to think (of course I did not come up with this approach) of the Hilbert space (and its wave functions) as a secondary object and take the observables as a starting point (and not derived as the operators acting on the wave functions). Those then are the elements of a (C*)-algebra and the Hilbert space only arises as a representation of that algebra. Stone and von Neumann for example then tell you that there is essentially a unique representation if the algebra is that of canonical commutation relations.

States are then functionals w that map each observable A to a complex number w(A) (interpreted as the expectation value). This linear functional has to be normalised, w(1)=1, and positive, meaning that for all A one has w(A^* A)>=0 (did I tell you that formulas are broken?). Then the GNS construction is similar to a highest weight representation: Using w and the algebra, one can construct a Hilbert space. As a vector space you can take the algebra itself; it becomes a representation after defining the action to be simply left multiplication. The scalar product of the elements A and B is given by w(A^* B). Positivity of w tells you this is at least positive semi-definite. One can quotient out the zero-space to obtain something positive definite and then employ some C*-magic to show that the action by left multiplication can be lifted to the quotient. I have suppressed some topological fine-print here, like taking completions etc.
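To see the construction in action, here is a minimal numerical sketch for the finite-dimensional case (my own toy illustration, not part of the original argument): take the algebra of n x n matrices, a state w(A) = Tr(rho A), form the Gram matrix w(A^* B) on a basis of the algebra, and read off the dimension of the GNS Hilbert space as the rank of that matrix.

```python
import numpy as np

n = 3                                     # work in the full matrix algebra M_n(C)

# A pure state w(A) = Tr(rho A) given by a unit vector psi.
psi = np.zeros(n, dtype=complex); psi[0] = 1.0
rho = np.outer(psi, psi.conj())           # rank-1 density matrix

# Basis of the algebra: the matrix units E_ij.
def E(i, j):
    m = np.zeros((n, n), dtype=complex)
    m[i, j] = 1.0
    return m

basis = [E(i, j) for i in range(n) for j in range(n)]

def w(A):                                 # the state as a linear functional
    return np.trace(rho @ A)

# GNS pre-inner product on the algebra: <A, B> = w(A^* B).
G = np.array([[w(A.conj().T @ B) for B in basis] for A in basis])

# The GNS space is the algebra modulo the null space of G; its dimension is rank(G).
# For a pure state this gives n (the irreducible representation on C^n);
# for a full-rank (mixed) rho it would give n**2, i.e. a reducible representation.
print("dim of GNS space:", np.linalg.matrix_rank(G, tol=1e-10))
```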

States correspond in general to density matrices (or reducible representations) and, as always, can be convexly combined as x w1 + (1-x) w2; the extremal states correspond to irreducible representations.

In quantum information applications (as well as EPR and decoherence), one often starts with a Hilbert space that is a tensor product H = H1 x H2. Restricting attention to the first factor only corresponds to taking the partial trace over H2 and in general turns pure states on H into mixed states on H1. This has the taste of "averaging over all possible states of H2", but in the algebraic formulation it becomes clear that one is only restricting a state w to the subalgebra of operators of the form A1 x id, where id is the identity on H2.
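Here is a small numpy check of that statement (the dimensions and the random state are of course arbitrary): expectation values of operators of the form A x id in a pure state on H agree with expectation values of A in the partial-trace density matrix on H1.

```python
import numpy as np

d1, d2 = 2, 3
# Random pure state on H = H1 (x) H2.
psi = np.random.randn(d1 * d2) + 1j * np.random.randn(d1 * d2)
psi /= np.linalg.norm(psi)
rho = np.outer(psi, psi.conj())

# Partial trace over H2: rho1[a, b] = sum_k rho[(a,k), (b,k)].
rho1 = rho.reshape(d1, d2, d1, d2).trace(axis1=1, axis2=3)

# A random self-adjoint observable acting only on the first factor: A (x) id.
A = np.random.randn(d1, d1) + 1j * np.random.randn(d1, d1)
A = A + A.conj().T
A_full = np.kron(A, np.eye(d2))

# Both numbers agree (up to numerical noise).
print(np.trace(rho @ A_full))   # w restricted to the subalgebra A (x) id
print(np.trace(rho1 @ A))       # partial trace over H2
```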

What I do not understand yet, and where I am asking for your help, is the following: How does the splitting into tensor factors really work on the algebraic side? In particular, assume I have a C*-algebra C and a pure state w. Now I take some subalgebra C1 of C and obtain a new state w1 on C1 by restricting w to this subalgebra. What is the relation of the two Hilbert spaces H and H1 that I obtain from the GNS construction on w and w1 respectively? What is a sufficient condition on C1 such that I can regard H1 as a tensor factor of H as above?

An obvious necessary condition in the finite-dimensional case comes from dimensions: there, the C*-algebras are just the complex matrix algebras of size n x n and the irreducible representation is C^n, which is a non-trivial tensor product only if n is not prime. But nothing stops me from, for example, starting with the big algebra being the 17x17 matrices and the subalgebra being those matrices whose last row and column are filled with zeros. And C^16 is definitely not a tensor factor of C^17.

Back from silence

It has been very silent here recently (or not so recently) but there is no particular reason for this except that I have been busy with other things (including an update on my facebook relationship status) and small things have been posted to Twitter rather than the blog.

And if one is not constantly taking care of things, they tend to degrade. So does this blog. What happened is that mathphys.jacobs-university.de, the computer that I have been using to host (background and other) images, the mimetex installation that was serving formulas for this blog, and a number of other CGI scripts, has died or at least is being turned off. Anyway, I have to relocate these things and I am still looking for a good solution. It should be a computer with a static, routed IP address on which I can install programs, and in particular CGI scripts, of my liking. Here at LMU this is probably not going to happen, for reasons of security paranoia on the sysadmin side. In addition, mathphys was handling my email traffic, meaning that currently spam reaches my inbox and messages are in danger of being deleted by well-meaning service providers. But this just means that the suffering is strong enough that I will be looking for a solution in the very near future, most likely by renting some virtual Linux server. Suggestions in this direction would be more than welcome.

Not long ago, I attended the 40th incarnation of the Ahrenshoop Symposium, once more organised by the Humboldt Uni crowd. This get-together had a particularly interesting selection of talks, many of which I really enjoyed. In particular I learned a lot and updated my opinions on F-Theory GUTs and AdS-Condensed Matter. Many thanks to the organisers! As you would expect, PDFs are online, except for Sean's, who gave a flip-chart talk (on four flip charts).

At that meeting I was asked what had happened to this blog and this post is supposed to be the answer to this question. I hope of course that more content will be here, soon. I was also asked to mention that it was Martin Rocek who got all the soap.

Wednesday, February 10, 2010

How to obtain a polymer Hilbert space

On Monday, I will be at HU Berlin to give a seminar on my loop cosmology paper (at 2pm in case you are interested and around). Preparing for that I came up with an even more elementary derivation of the polymer Hilbert space (without need to mention C*-algebras, the GNS-construction etc). Here it goes:

Let us do quantum mechanics on the line. That is, the operators we care about are x and p. But as you probably know, those (more precisely, operators with the commutation relation [x,p]=i) cannot both be bounded, so there are problems with domains of definition and limits. One of the (well accepted) ways to get around this is to work with the Weyl operators U(a)=\exp(iax) and V(b)=\exp(ibp) instead. As those are unitary, they have norm 1, and the canonical commutation relations read (with the help of Baker, Campbell and Hausdorff) U(a)V(b)=V(b)U(a)e^{iab}. If you later want, you can go back to x=-i\,dU(a)/da|_{a=0} and similarly for p.

Our goal is to come up with a Hilbert space where these operators act. In addition, we want to define a scalar product on that space such that U and V act as unitary operators preserving this scalar product. We will deal with the position representation, that is wave functions \psi(x). U and V then act in the usual way, V(b) by translation (V(b)\psi)(x)=\psi(x-b) and U(a) by multiplication (U(a)\psi)(x)=e^{iax}\psi(x). Obviously, these fulfil the commutation relation. You can think of U and V as the group elements of the Heisenberg group while x and p are in the Lie algebra.
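If you want to convince yourself of the last claim, here is a small sympy check on a plane wave (my own little verification, nothing deep):

```python
import sympy as sp

x, k, a, b = sp.symbols('x k a b', real=True)
psi = sp.exp(sp.I * k * x)                       # a plane wave as test function

U = lambda a_, f: sp.exp(sp.I * a_ * x) * f      # (U(a) psi)(x) = e^{iax} psi(x)
V = lambda b_, f: f.subs(x, x - b_)              # (V(b) psi)(x) = psi(x - b)

lhs = U(a, V(b, psi))                            # U(a) V(b) psi
rhs = sp.exp(sp.I * a * b) * V(b, U(a, psi))     # e^{iab} V(b) U(a) psi
print(sp.simplify(lhs - rhs))                    # 0, i.e. U(a)V(b) = V(b)U(a) e^{iab}
```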

Here now comes the only deviation from the usual path (all the rest then follows): We argue (motivated by similar arguments in the loopy context) that since motion on the real line is invariant under translations (at least until we specify a Hamiltonian), we should have a state in the Hilbert space which has this symmetry. Thus we declare the constant wave function |1\rangle=\psi(x)=1 to be an element of the Hilbert space, and we can assume that it is normalised, i.e. \langle 1|1\rangle=1.

Acting now with U(a), we find that linear combinations of plane waves e^{ikx} are then as well in the Hilbert space. By unitarity of U(a), it follows that \langle e^{ikx}| e^{ikx}\rangle =1, too. It remains to determine the scalar product of two different plane waves \langle e^{ikx}|e^{ilx}\rangle. This is found using the unitarity of V and sesquilinearity of the scalar product: \langle e^{ikx}|e^{ilx}\rangle = \langle V(b) e^{ikx}|V(b)e^{ilx}\rangle = e^{ib(l-k)}\langle e^{ikx}|e^{ilx}\rangle. This has to hold for all b and thus if k\ne l it follows that the scalar product vanishes.

Thus we have found our (polymer) Hilbert space: It is the space of (square summable) linear combinations of plane waves, with a scalar product such that the e^{ikx} are an orthonormal basis.

Now, what about x and p? It is easy to see that p, when defined by a derivative as above, acts in the usual way, that is, on a basis element pe^{ikx}=ke^{ikx}, which is unbounded as k can be arbitrarily large. The price for having plane waves as normalisable wave functions is, however, that x is not defined: It would be xe^{ikx} = -i\lim_{\epsilon\to 0}\frac{e^{i(k+\epsilon)x}-e^{ikx}}{\epsilon}. But for \epsilon\ne 0 the two exponentials in the numerator are always orthogonal and thus not "close" as measured by the norm: the numerator always has norm \sqrt{2}, so the difference quotient has norm \sqrt{2}/\epsilon and the limit diverges. Another way to see this is to notice that x would of course act as multiplication by the coordinate x, but x times a plane wave is no longer a linear combination of plane waves.
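A toy implementation makes the divergence explicit (representing states as dictionaries {k: coefficient} is of course just my own illustration of the polymer scalar product):

```python
# States: finite linear combinations of plane waves, stored as {k: coefficient}.
def inner(phi, psi):
    # <e^{ikx}, e^{ilx}> = 1 if k == l else 0
    return sum(phi[k].conjugate() * psi[k] for k in phi if k in psi)

def norm(psi):
    return inner(psi, psi) ** 0.5

# p acts diagonally, p e^{ikx} = k e^{ikx}, and is well defined.
# x would be the limit of difference quotients in k; its norm diverges:
k = 1.0
for eps in (1.0, 0.1, 0.01, 0.001):
    diff_quot = {k + eps: 1 / eps, k: -1 / eps}   # (e^{i(k+eps)x} - e^{ikx}) / eps
    print(eps, norm(diff_quot))                   # = sqrt(2)/eps, blows up as eps -> 0
```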

To make contact with loop cosmology one just has to rename the variables: What I called p for simplicity of presentation is the volume element v in loop cosmology, while the role of x is played by the conjugate momentum \beta.

If you want, you can find my notes for the blackboard talk at HU here (pdf or djvu).

Wednesday, January 27, 2010

Entropic Everything

The latest paper by Eric Verlinde on gravity as an entropic force makes me wonder whether I am getting old. Let me admit it: I just don't get it. Is this because I am conservative, lack imagination, or am too narrow-minded? If it were not for the author, I would have rated it as pure crackpottery. But maybe I am missing something. Today, there were three follow-up papers dealing with cosmological consequences (the idea being, roughly, that Verlinde uses the equipartition of energy between degrees of freedom, each getting a share of 1/2 kT, which is not true quantum mechanically at low temperatures, since there the system sits in the ground state with the ground-state energy; as in this business temperature equals acceleration a la Unruh, the argument gets modified at small accelerations, which is a modification of MOND type).

Maybe later I will try once more to get into the details and might have some more sensible comments then, but right now the way different equations from all kinds of different settings (Unruh temperature was already mentioned, E=mc^2, one bit per Planck area, etc.) are assembled reminds me of this:

Tuesday, January 19, 2010

Instability of the QED vacuum at large fine structure constant

Today, in the "Mathematical Quantum Mechanics" lecture, I learned that the QED vacuum (or at least the quantum mechanical sector of it) is unstable when the fine structure constant gets too big.

To explain this, let's go back to a much simpler problem: Why is the hydrogen-like atom stable? Well, a simple answer is that you just solve it and find the spectrum to be bounded from below by -13.6\,Z^2\,{\rm eV}. But this answer does not extend to other problems that cannot be diagonalised analytically.

First of all, what is the problem we are considering? It's the potential energy of the electron, which in natural (for atomic physics) units is V(r)=-\alpha Z/r, and this goes to negative infinity as r goes to 0. But quantum mechanics saves you. Roughly speaking (this argument can be made mathematically sound in terms of Hardy's inequality), if you essentially localise the electron in a ball of radius R and thus have potential energy V\le-\alpha Z/R, Heisenberg's uncertainty implies the momentum is at least of the order 1/R and thus the kinetic energy is at least of the order +1/R^2. Thus, when R becomes small and you seem to approach the throat of the potential, the positive kinetic energy wins, and the Hamiltonian of the hydrogen atom is bounded from below. This is the non-relativistic story.

Close to the nucleus, however, the momentum can be so big that you have to think relativistically. But then trouble starts: at large momenta the energy grows only linearly with momentum, and thus the kinetic energy only scales like +1/R, which is the same scaling as the potential energy. A more careful calculation is needed, and its result depends on \alpha Z: above a critical value (which happens to be of order one) the atom is unstable, one can gain an infinite amount of energy by lowering the electron into the nucleus, and quantum mechanics is not going to help.
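Schematically (suppressing all order-one constants, so take this only as a sketch of the scaling argument of the last two paragraphs), localising the electron in a ball of radius R gives

E_{\rm non-rel}(R) \sim \frac{1}{2mR^2} - \frac{\alpha Z}{R},

which is bounded from below; minimising over R gives R \sim 1/(m\alpha Z) and E \sim -\tfrac{1}{2}m(\alpha Z)^2, the familiar hydrogen scaling. Relativistically, however,

E_{\rm rel}(R) \sim \frac{c_1}{R} - \frac{\alpha Z}{R} = \frac{c_1-\alpha Z}{R}

with some constant c_1 of order one, so once \alpha Z exceeds c_1 the energy is unbounded from below as R\to 0.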

Luckily, nuclei with large enough Z do not exist in nature. Well, with the exception of neutron stars, which are effectively large nuclei. And there it happens: all the electrons are sucked into the nucleus and fuse with the protons to form neutrons. In fact, the finite size of the nucleus is what regulates this process, as the 1/r nature of the Coulomb potential is smeared out inside the nucleus. But such a highly charged atom would only be of the size of the nucleus (a few femtometers) rather than the size of typical atoms.

But now comes QED with the possibility of forming electron-positron pairs out of the vacuum. The danger I am talking about is that they can form a relativistic, hydrogen-like bound state. And both are (as far as we know) point-like, so there is no smearing out of the charge. It is only that \alpha Z = \alpha \approx 1/137 in this case (Z=1), which luckily is less than one. If it were bigger, you could create an infinite amount of energy from the vacuum by pair creation, bringing the pair on-shell deep in its relative Coulomb throat. What a scary thought. Especially since \alpha is probably only the vev of some scalar field, which could take other values in other parts of the multiverse, which would then disappear with a loud bang.

Some things come to mind that could in principle help but turn out to make things worse: \alpha is not a constant but runs, and QED has "asymptotic slavery", which means that at short distances (which are what we are talking about) it gets bigger, making things worse. Further, we are treating the electromagnetic field classically, which of course is not correct; but my mathematical friends tell me that quantising it also worsens things.

We know that QED has other problems, like the Landau pole (a finite scale where \alpha goes to infinity due to quantum effects). But it seems to me that this is a different problem, since the instability already appears at \alpha\approx 1.

Any ideas or comments?