k$ the fraction $\Bover{k}{m}$ can be expressed as $\Bover1{m_1}+\Bover1{m_2}+\Bover1{m_3}+\ldots+\Bover1{m_r}$, where $r\ge 1$ and $m_1,\ldots,m_r$ are distinct integers greater than $0$. \hfill P($k$)} \noindent The start is now very easy. {\bf Base step} P(1) says just that \inset{whenever $m>1$ the fraction $\Bover{1}{m}$ can be expressed as $\Bover1{m_1}+\Bover1{m_2}+\Bover1{m_3}+\ldots+\Bover1{m_r}$, where $r\ge 1$ and $m_1,\ldots,m_r$ are distinct integers greater than $0$,} \noindent and this is certainly true; all we have to do is to take $r=1$ and $m_1=m$, and we just have $\Bover1m=\Bover1m$, which is true. The interesting part is in the {\bf Inductive step} Suppose that $k>1$ and that P($i$) is true whenever $1\le i<k$. Take any $m>k$ and consider the fraction $\Bover{k}{m}$. Where can we start? The key is: {\it be greedy}. We want to find some numbers adding up to $\Bover{k}{m}$; well, start with the biggest number we're allowed. It's got to be a reciprocal $\Bover1{m_1}$; now you make this big by making $m_1$ small, so the rule is \inset{take the smallest $m_1$ such that $\Bover1{m_1}\le\Bover{k}m$.} \noindent Now what do we know about $m_1$? First, we have $\Bover1{m_1}\le\Bover{k}m$, so $m\le km_1$. On the other hand, we know that we weren't allowed to use $m_1-1$, so $\Bover1{m_1-1}>\Bover{k}{m}$, and $m>km_1-k$. (Note that $m_1>1$, because $\Bover{k}m<1$.) Now look at what's left. We have \Centerline{$\Bover{k}m-\Bover1{m_1}=\Bover{km_1-m}{mm_1}$.} \noindent In this fraction, the denominator has got worse ($mm_1$ is going to be bigger than $m$), but the numerator is $km_1-m$, and we know enough to locate this. Since $km_1-k<m\le km_1$, we have $0\le km_1-m<k$. If $km_1-m=0$, then $\Bover{k}m=\Bover1{m_1}$ and we can take $r=1$. Otherwise, set $k'=km_1-m$; then $1\le k'<k$ and $mm_1>k'$, so P($k'$) tells us that $\Bover{k'}{mm_1}$ can be expressed as $\Bover1{m_2}+\ldots+\Bover1{m_r}$, where $m_2,\ldots,m_r$ are distinct integers greater than $0$. Moreover, for each $j$, \Centerline{$\Bover1{m_j}\le\Bover{k'}{mm_1}=\Bover{k}m-\Bover1{m_1} <\Bover1{m_1-1}-\Bover1{m_1}=\Bover1{m_1(m_1-1)}\le\Bover1{m_1}$,} \noindent so every $m_j>m_1>1$, and all the $m_j$ are different from $m_1$ as well as from each other. Thus we can express $\Bover{k}m$ as a sum of distinct reciprocals. As $m$ is arbitrary, P($k$) is true and the induction proceeds.
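The greedy rule in this proof is in effect an algorithm, and (if you like such things) it can be tried out on a machine. The following is only a sketch of my own, not part of the argument; the function name {\tt egyptian} is mine, and {\tt Fraction} does exact rational arithmetic:

```python
from fractions import Fraction

def egyptian(k, m):
    """Greedy expansion of k/m (with 0 < k < m) as a sum of
    distinct unit fractions 1/m_1 + ... + 1/m_r."""
    x = Fraction(k, m)
    denominators = []
    while x > 0:
        # the smallest m1 with 1/m1 <= x is the ceiling of 1/x
        m1 = -((-x.denominator) // x.numerator)
        denominators.append(m1)
        x -= Fraction(1, m1)
    return denominators
```

For example, {\tt egyptian(5, 6)} returns {\tt [2, 3]}, matching $\Bover56=\Bover12+\Bover13$; the proof above is exactly the check that the denominators produced are distinct and that the loop stops.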
\fi %end of Mathematicians \vfill\eject \Centerline{{\bf Appendix} (with some repetitions)} \filename{ma120.tex} \versiondate{18.12.01} Recall the definition of the Fibonacci sequence: \Centerline{$a_0=0$, $a_1=1$, $a_n=a_{n-1}+a_{n-2}$ for every $n\ge 2$.} \noindent Surprisingly, we have a straightforward formula: $$a_n = \bover{1}{\sqrt 5}\biggl(\bigl(\bover{1 + \sqrt 5}{2}\bigr)^n - \bigl(\bover{1 - \sqrt 5}{2}\bigr)^n\biggr) $$ \noindent (What is surprising about this is that all the $\sqrt 5$s somehow cancel out to give integer values for every $n$.) To check this, express the formula as $$\eqalignno{a_n &= \bover{\alpha^n - \beta^n}{\sqrt 5}\cr}$$ \noindent where $\alpha = \Bover{1 + \sqrt 5}{2}$ and $\beta = \Bover{1 -\sqrt 5}{2}$. Now the reason it works is that $\alpha$ and $\beta$ have the following special property: $$\eqalign{\alpha^2 &=\bigl(\bover{1+\sqrt 5}2\bigr)^2\cr &=\bover{1+2\sqrt 5+5}{4}\cr &=\bover{4+2(1+\sqrt 5)}{4}=1+\alpha,\cr}$$ \noindent while $$\eqalign{\beta^2 &=\bigl(\bover{1-\sqrt 5}2\bigr)^2\cr &=\bover{1-2\sqrt 5+5}{4}\cr &=\bover{4+2(1-\sqrt 5)}{4}=1+\beta.\cr}$$ \noindent Now we can look at the things we're trying to prove: $$\eqalignno{a_n&=\Bover1{\sqrt 5}(\alpha^n-\beta^n)&\text{P}(n)}$$ {\bf Base step} P(0) says $$a_0=\Bover1{\sqrt 5}(\alpha^0-\beta^0)$$ \noindent which is true, because $$\Bover1{\sqrt 5}(\alpha^0-\beta^0)=\Bover1{\sqrt 5}(1-1)=0=a_0.$$ \noindent P(1) says $$a_1=\Bover1{\sqrt 5}(\alpha^1-\beta^1)$$ \noindent which is true, because $$\eqalign{\bover1{\sqrt 5}(\alpha^1-\beta^1) &=\bover{\alpha-\beta}{\sqrt 5}\cr &=\bover{\Bover{1+\sqrt 5}2-\Bover{1-\sqrt 5}2}{\sqrt 5}\cr &=\Bover{2\sqrt 5}{2\sqrt 5} =1=a_1.\cr}$$ {\bf Inductive step} Assuming that P($n-1$) and P($n-2$) are true, we set out to prove P($n$), that is, that $a_n=\Bover1{\sqrt 5}(\alpha^n-\beta^n)$.
Now our defining formula for $a_n$ says that it's $a_{n-1}+a_{n-2}$, so $$\eqalignno{a_n &=a_{n-1}+a_{n-2}\cr &=\Bover1{\sqrt 5}(\alpha^{n-1}-\beta^{n-1}) +\Bover1{\sqrt 5}(\alpha^{n-2}-\beta^{n-2})\cr \noalign{\noindent (because P($n-1$) and P($n-2$) are both supposed to be true)} &=\Bover1{\sqrt 5}(\alpha^{n-1}-\beta^{n-1}+\alpha^{n-2}-\beta^{n-2})\cr &=\Bover1{\sqrt 5}(\alpha^{n-1}+\alpha^{n-2} -(\beta^{n-1}+\beta^{n-2}))\cr &=\Bover1{\sqrt 5}(\alpha^{n-2}(\alpha+1)-\beta^{n-2}(\beta+1))\cr &=\Bover1{\sqrt 5}(\alpha^{n-2}\alpha^2-\beta^{n-2}\beta^2)\cr \noalign{\noindent (because $\alpha^2=\alpha+1$, $\beta^2=\beta+1$)} &=\Bover1{\sqrt 5}(\alpha^n-\beta^n),\cr}$$ \noindent and $a_n=\Bover1{\sqrt 5}(\alpha^n-\beta^n)$, which is what we needed to prove. This works for every $n\ge 2$, so the induction continues. \medskip I hope you noticed that I was using a slightly different kind of induction here. Instead of saying \inset{P($0$) is true and P($n+1$) is true whenever P($n$) is true, so P($n$) must be true for all $n\ge 0$,} \noindent I said \inset{P($0$) and P($1$) are true and P($n$) is true whenever P($n-1$) and P($n-2$) are both true, so P($n$) must be true for all $n\ge 0$.} \noindent This is still OK, because we can still work up the ladder: \inset{P($0$) is true,} \inset{P($1$) is true,} \inset{because P($0$) and P($1$) are both true, P($2$) is true,} \inset{because P($1$) and P($2$) are both true, P($3$) is true,} \inset{because P($2$) and P($3$) are both true, P($4$) is true,} \inset{because P($3$) and P($4$) are both true, P($5$) is true,} \inset{because P($4$) and P($5$) are both true, P($6$) is true,} \inset{$\ldots$;} \noindent the process still takes us as far as we like. If you think of induction as climbing a ladder, the {\it base step} is a machine to get you onto the bottom rung, and the {\it inductive step} is a machine to lift you from one rung to the next. 
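The formula P($n$) is also easy to check by machine. Here is a minimal sketch of my own (not part of the text), comparing the closed formula with the recurrence; the floating-point values of $\alpha^n$ and $\beta^n$ are rounded to the nearest integer, which is safe for small $n$:

```python
import math

def fib_recurrence(n):
    # a_0 = 0, a_1 = 1, a_n = a_{n-1} + a_{n-2}
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_binet(n):
    # a_n = (alpha^n - beta^n) / sqrt(5)
    sqrt5 = math.sqrt(5)
    alpha = (1 + sqrt5) / 2
    beta = (1 - sqrt5) / 2
    return round((alpha ** n - beta ** n) / sqrt5)

# the two definitions agree on the first thirty values
assert all(fib_binet(n) == fib_recurrence(n) for n in range(30))
```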
To use the First Principle of Induction we need a proof that P($n$)$\Rightarrow$P($n+1$); that is, a machine which will lift us from level $n$ to level $n+1$. In the argument just above, the machine needed to rest on both the preceding levels; it was a proof that P($n-2$) and P($n-1$) together implied P($n$). So of course the base step had to check both P($0$) and P($1$), because our machine couldn't lift us from level $0$ to level $1$. But resting on levels $0$ and $1$, it could raise us to level $2$; resting on levels $1$ and $2$, it could raise us to level $3$; and so on. It would still take us to any point we needed to reach. Now if you think about it, an inductive argument can use much more elaborate machinery if that seems to help. The point about a proof by induction is that we look at the propositions P($0$), P($1$), P($2$), $\ldots$ in order, and don't even think about proving P($7$) until we've done all the preceding ones. So (if it seems to be useful) when designing the machine to reach level $n$, we can assume that P($i$) is true for every $i<n$. This is the idea of the \medskip {\bf Second Principle of Induction} Suppose that \inset{P($0$) is true, and P($n$) is true whenever $n>0$ and $P(i)$ is true whenever $0\le i<n$.} \noindent Then P($n$) is true for every $n\ge 0$. \medskip For a first example, consider the fact that every integer greater than $1$ is a product of prime numbers. For the statement P($n$), take \inset{$n$ can be expressed as a product of prime numbers.} \noindent {\bf Base step} P(2) is true, because $2$ is itself prime. {\bf Inductive step} Suppose that $k>2$ and that P($i$) is true whenever $2\le i<k$. If $k$ is prime, then P($k$) is true at once. Otherwise, $k=ab$ where $a$ and $b$ are integers with $2\le a<k$ and $2\le b<k$; by the inductive hypothesis, both $a$ and $b$ are products of primes, so $k=ab$ is also a product of primes, and P($k$) is again true. Since this works for every $k>2$, P($n$) is true for every $n\ge 2$. \medskip For another example of the use of the Second Principle of Induction, I take another fundamental fact about whole numbers. \medskip {\bf Theorem} If $p$, $q\in\Bbb P$, there are $r$, $s\in\Bbb Z$ such that $rp+sq$ is a factor of both $p$ and $q$. \medskip {\bf proof} Actually the first question to sort out here is: what are we supposed to induce {\it on}? Both the First and Second principles insist on starting with a proposition of the form P($n$). Here we have one which talks about two integers $p$ and $q$. Will it work if we call one of them $n$? Is there anything else we can do? There is a fairly standard approach to this. The idea of proofs by induction is that you do the `smallest' numbers first.
So presumably we should start with the cases $(p,q)=(1,1)$, $(1,2)$, $(2,1)$, $(2,2)$, $(1,3)$, $(3,1)$ (or perhaps $(1,3)$ should come before $(2,2)$). What we mean by saying that `the pair $p$, $q$ is small' might be something like `$p+q$ is small' or `the larger of $p$ and $q$ is small'. Actually it very often doesn't matter which of these rules we take; but my own prejudice is in favour of taking `the larger of $p$ and $q$' as my measure of the size of the problem. So let us try, for the statement P($n$), the statement \inset{if $p$, $q\in\Bbb P$ and $\max(p,q)=n$, then there are integers $r$ and $s$ such that $rp+sq$ is a factor of both $p$ and $q$,} \noindent and see how we get on\footnote{For the rest of this argument, I will write `$\max(p,q)$' for `the larger of $p$ and $q$'; if $p=q$, then $\max(p,q)=p=q$.}. {\bf Base step} If $n=1$, then P(1) is \inset{if $p$, $q\in\Bbb P$ and $\max(p,q)=1$, then there are integers $r$ and $s$ such that $rp+sq$ is a factor of both $p$ and $q$,} \noindent But this is certainly true, because if $\max(p,q)=1$ then we can only have $p=q=1$, in which case we can set $r=1$, $s=0$ and get $rp+sq=1$, which is surely a factor of both $p$ and $q$. (You will see that I am starting from the case $n=1$ rather than the case $n=0$. Actually $P(0)$ is easier, because there are {\it no} $p$, $q\in\Bbb P$ such that $\max(p,q)=0$! so there is nothing to check.) {\bf Inductive step} Now suppose that $k>1$, that $P(i)$ is true whenever $1\le i<k$, and that $p$, $q\in\Bbb P$ are such that $\max(p,q)=k$. There are three possibilities: $p<q$, $p=q$ and $p>q$. The middle one is the simplest, so let's take that first: \inset{{\bf case 1} If $p=q$, then set $r=1$, $s=0$, so that $rp+sq=p$ is a factor of both $p$ and $q$ (just as in the case $p=q=1$).} \noindent The other cases are virtually identical to each other. Let's take \inset{{\bf case 2} If $p<q$, look at the pair $p$, $q-p$. Both belong to $\Bbb P$, and $\max(p,q-p)<q=k$; so by the inductive hypothesis there are integers $r'$ and $s'$ such that $r'p+s'(q-p)$ is a factor of both $p$ and $q-p$, and therefore of $q=p+(q-p)$. Now set $r=r'-s'$ and $s=s'$; then $rp+sq=r'p+s'(q-p)$ is a factor of both $p$ and $q$.} \noindent \inset{{\bf case 3} If $p>q$, the argument of case 2 applies with $p$ and $q$ interchanged.} \noindent Thus in all three cases there are suitable $r$ and $s$, P($k$) is true, and the induction proceeds. \medskip Let us return to the problem of Egyptian fractions, and show that {\it every} fraction $\Bover{k}m$ with $0<k<m$ can be expressed as a sum of distinct reciprocals. This time, for the statement P($k$), take \inset{whenever $m>k$ the fraction $\Bover{k}{m}$ can be expressed as $\Bover1{m_1}+\Bover1{m_2}+\Bover1{m_3}+\ldots+\Bover1{m_r}$, where $r\ge 1$ and $m_1,\ldots,m_r$ are distinct integers greater than $0$.
\hfill P($k$)} \noindent The start is now very easy. {\bf Base step} P(1) says just that \inset{whenever $m>1$ the fraction $\Bover{1}{m}$ can be expressed as $\Bover1{m_1}+\Bover1{m_2}+\Bover1{m_3}+\ldots+\Bover1{m_r}$, where $r\ge 1$ and $m_1,\ldots,m_r$ are distinct integers greater than $0$,} \noindent and this is certainly true; all we have to do is to take $r=1$ and $m_1=m$, and we just have $\Bover1m=\Bover1m$, which is true. The interesting part is in the {\bf Inductive step} Suppose that $k>1$ and that P($i$) is true whenever $1\le i<k$. Take any $m>k$ and consider the fraction $\Bover{k}{m}$. Where can we start? The key is: {\it be greedy}. We want to find some numbers adding up to $\Bover{k}{m}$; well, start with the biggest number we're allowed. It's got to be a reciprocal $\Bover1{m_1}$; now you make this big by making $m_1$ small, so the rule is \inset{take the smallest $m_1$ such that $\Bover1{m_1}\le\Bover{k}m$.} \noindent Now what do we know about $m_1$? First, we have $\Bover1{m_1}\le\Bover{k}m$, so $m\le km_1$. On the other hand, we know that we weren't allowed to use $m_1-1$, so $\Bover1{m_1-1}>\Bover{k}{m}$, and $m>km_1-k$. (Note that $m_1>1$, because $\Bover{k}m<1$.) Now look at what's left. We have \Centerline{$\Bover{k}m-\Bover1{m_1}=\Bover{km_1-m}{mm_1}$.} \noindent In this fraction, the denominator has got worse ($mm_1$ is going to be bigger than $m$), but the numerator is $km_1-m$, and we know enough to locate this. Since $km_1-k<m\le km_1$, we have $0\le km_1-m<k$. If $km_1-m=0$, then $\Bover{k}m=\Bover1{m_1}$ and we can take $r=1$. Otherwise, set $k'=km_1-m$; then $1\le k'<k$ and $mm_1>k'$, so P($k'$) tells us that $\Bover{k'}{mm_1}$ can be expressed as $\Bover1{m_2}+\ldots+\Bover1{m_r}$, where $m_2,\ldots,m_r$ are distinct integers greater than $0$. Moreover, for each $j$, \Centerline{$\Bover1{m_j}\le\Bover{k'}{mm_1}=\Bover{k}m-\Bover1{m_1} <\Bover1{m_1-1}-\Bover1{m_1}=\Bover1{m_1(m_1-1)}\le\Bover1{m_1}$,} \noindent so every $m_j>m_1>1$, and all the $m_j$ are different from $m_1$ as well as from each other. Thus we can express $\Bover{k}m$ as a sum of distinct reciprocals. As $m$ is arbitrary, P($k$) is true and the induction proceeds. \medskip {\bf Remark} In fact we can do a bit better; the result is true for any rational number greater than 0, not just those between 0 and 1. To see this, we need to know the following, which is very important in itself.
\medskip {\bf Theorem} For any $n\in\Bbb N$, \Centerline{$1+\Bover12+\Bover13+\Bover14+\ldots+\Bover1{2^n} \ge 1+\Bover{n}2$.} \medskip \noindent{\bf proof} Induce on $n$, using the First Principle. {\bf Base step} If $n=0$, then we are saying that \Centerline{$1\ge 1+\Bover{0}2$,} \noindent which is certainly true. {\bf Inductive step} Assuming that \Centerline{$1+\Bover12+\Bover13+\ldots+\Bover1{2^n} \ge 1+\Bover{n}2$} \noindent we want to prove that \Centerline{$1+\Bover12+\Bover13+\ldots+\Bover1{2^{n+1}} \ge 1+\Bover{n+1}2$.} \noindent But $$\eqalignno{1+\Bover12+\Bover13+\ldots+\Bover1{2^{n+1}} &=1+\Bover12+\Bover13+\ldots+\Bover1{2^n} +\Bover1{2^n+1}+\ldots+\Bover1{2^{n+1}}\cr &\ge 1+\Bover{n}2+\Bover1{2^n+1}+\Bover1{2^n+2}+\ldots +\Bover1{2^{n+1}-1}+\Bover1{2^{n+1}}\cr \displaycause{by the inductive hypothesis} &\ge 1+\Bover{n}2+\Bover1{2^{n+1}}+\ldots+\Bover1{2^{n+1}}\cr \displaycause{because $\Bover1k\ge\Bover1{2^{n+1}}$ whenever $2^n<k\le 2^{n+1}$} &=1+\Bover{n}2+\Bover{2^n}{2^{n+1}}\cr \displaycause{because there are $2^n$ terms $\Bover1{2^{n+1}}$ in the sum} &=1+\Bover{n}2+\Bover12=1+\Bover{n+1}2,\cr}$$ \noindent which is what we wanted to prove. So the induction proceeds. \medskip Now let $a>0$ be rational. If $a<1$ we already know that it can be expressed as a sum of distinct reciprocals. If $a\ge 1$, consider $1$, $1+\Bover12$, $1+\Bover12+\Bover13$, $\ldots$. We know that eventually these sums have to go past $a$; if $n\ge 2(a-1)$, then \Centerline{$1+\Bover12+\ldots+\Bover1{2^n}\ge 1+\Bover{n}2\ge a$.} \noindent So we must be able to find some $k$ such that \Centerline{$1+\Bover12+\Bover13+\ldots+\Bover1k\le a <1+\Bover12+\Bover13+\ldots+\Bover1k+\Bover1{k+1}$.} \noindent If $a=1+\Bover12+\ldots+\Bover1k$ we can stop. Otherwise, set $b=a-1-\Bover12-\ldots-\Bover1k$. Then $b$ is a rational number strictly between $0$ and $\Bover1{k+1}$.
By what we proved about Egyptian fractions above, we can express $b$ as $\Bover1{m_1}+\ldots+\Bover1{m_r}$ where $m_1,\ldots,m_r$ are all different, so that \Centerline{$a =1+\Bover12+\ldots+\Bover1k+\Bover1{m_1}+\ldots+\Bover1{m_r}$.} \noindent Also, for $1\le i\le r$, $\Bover1{m_i}\le b<\Bover1{k+1}$ so $m_i>k+1$; thus all the numbers $1,2,\ldots,k,m_1,\ldots,m_r$ are different and we have expressed $a$ as a sum of distinct reciprocals. \bigskip {\bf The Marriage Lemma} For my last example of a proof by induction, I will try to explain one of the most important theorems of `combinatorics' -- the branch of mathematics which deals with patterns and counting. First, a bit of special notation. If $R$ is a relation (that is, a set of ordered pairs), and $I$ is a set, I will write $R[I]=\{y:\,\Exists x\in I,\,(x,y)\in R\}$ for the set of objects related to some member of $I$ by the relation $R$. Next, we need to know what an `injective' function is. A function $f$ is {\bf injective} or {\bf one-to-one} if \inset{it never takes the same value twice} \inset{the equation $f(x)=y$ never has more than one solution for $x$} \inset{if $x$, $x'$ are different points in the domain of $f$, then $f(x)\ne f(x')$} \inset{if $x$, $x'\in\dom f$ and $f(x)=f(x')$ then $x=x'$;} \noindent all these four expressions are supposed to be saying the same thing. In terms of the familiar functions of calculus, \Centerline{$f(x)=\exp x$ for $x\in\Bbb R$,} \Centerline{$f(x)=\ln x$ for $x>0$,} \Centerline{$f(x)=x^3$ for $x\in\Bbb R$,} \Centerline{$f(x)=\sqrt x$ for $x\ge 0$} \noindent are all injective, but \Centerline{$f(x)=x^2$ for $x\in\Bbb R$,} \Centerline{$f(x)=\sin x$ for $x\in\Bbb R$} \noindent are not ($(-1)^2=1^2$, $\sin 0=\sin\pi$). In terms of arrow diagrams, \picture{cc1118d}{141.75pt} \noindent and \picture{cc1118f}{141.75pt} \noindent represent injective functions, but \picture{cc1118e}{141.75pt} \noindent does not. Now for the theorem.
\medskip {\bf Hall's Marriage Lemma} Suppose that $X$ is a finite set, $Y$ is any set, and $R\subseteq X\times Y$ is a relation such that $\#(R[I])\ge\#(I)$ for every $I\subseteq X$. Then there is an injective function $f:X\to Y$ such that $(x,f(x))\in R$ for every $x\in X$. \medskip \noindent{\bf Remarks} Before starting on the proof, let me explain why this is called the `marriage lemma'. Imagine that $X$ is a set of girls, $Y$ is a set of boys and that $R$ is the set of pairs $(x,y)$ who are willing to dance together. A master of ceremonies is trying to pair them off; that is, to find a function $f$ so that $f(x)$ will be the partner of $x$. Every girl is supposed to get a partner (that is, $\dom f$ should be $X$); every girl should have a different partner (that is, $f$ should be injective); and all pairs should be acceptable to each other (that is, $(x,f(x))$ should belong to $R$ for every $x$). The Marriage Lemma gives a rule to decide when there will be such a pairing. Generally, the Marriage Lemma gives us a rule for deciding when allocation problems are soluble. For almost any kind of finite allocation of discrete objects it is the first thing to look at. \medskip \noindent{\bf proof of the Marriage Lemma} Induce on $\#(X)$, using the Second Principle of Induction. \medskip {\bf Base step} If $\#(X)=0$, that is, $X=\emptyset$, the problem is trivial. We have no girls to find partners for and don't have to do anything. Technically speaking, $R=\emptyset$ and the function $f$ will have to be the `empty function', the function with domain $\emptyset$ and no values. \medskip {\bf Inductive step} Suppose that the lemma is true whenever $0\le\#(X)<n$, where $n>0$, and that we are given $X$, $Y$ and $R$ such that $\#(X)=n$, $R\subseteq X\times Y$ and $\#(R[I])\ge\#(I)$ for every $I\subseteq X$; we are looking for a suitable function from $X$ to $Y$. \medskip {\bf Case 1} Suppose that there is a set $I\subseteq X$ such that $0<\#(I)<\#(X)$ and $\#(R[I])=\#(I)$.
In this case, set \Centerline{$X_1=I$, \quad$R_1=R\cap(I\times Y)$,} \Centerline{$X_2=X\setminus I$, \quad$R_2=R\cap((X\setminus I)\times(Y\setminus R[I]))$.} \noindent Then $R_1\subseteq X_1\times Y$. If $J\subseteq X_1$, $R_1[J]=R[J]$ so $\#(R_1[J])=\#(R[J])\ge\#(J)$. Also $\#(X_1)<\#(X)=n$; by the inductive hypothesis, there is an injective function $f_1:X_1\to Y$ such that $(x,f_1(x))\in R_1$ for every $x\in X_1$. Next, $R_2\subseteq X_2\times Y$. If $J\subseteq X_2$, then \Centerline{$R_2[J]=R[J]\setminus R[I]=(R[I]\cup R[J])\setminus R[I] =R[I\cup J]\setminus R[I]$,} \noindent so $$\eqalignno{\#(R_2[J]) &=\#(R[I\cup J])-\#(R[I])\ge\#(I\cup J)-\#(R[I])=\#(I\cup J)-\#(I)\cr \displaycause{by the choice of $I$} &=\#(J).\cr}$$ \noindent Also $\#(X_2)<\#(X)=n$; by the inductive hypothesis again, there is an injective function $f_2:X_2\to Y$ such that $(x,f_2(x))\in R_2$ for every $x\in X_2$. Now define $f:X\to Y$ by setting $$\eqalign{f(x)&=f_1(x)\text{ if }x\in X_1\cr &=f_2(x)\text{ if }x\in X_2.\cr}$$ \noindent If $x\in X_1$, then \Centerline{$(x,f(x))=(x,f_1(x))\in R_1\subseteq R$;} \noindent if $x\in X_2$, then \Centerline{$(x,f(x))=(x,f_2(x))\in R_2\subseteq R$.} \noindent So $(x,f(x))\in R$ for every $x\in X$. If $x$, $x'\in X$ are distinct, then \inset{if $x$, $x'\in X_1$, $f(x)=f_1(x)\ne f_1(x')=f(x')$ because $f_1$ is injective,} \inset{if $x$, $x'\in X_2$, $f(x)=f_2(x)\ne f_2(x')=f(x')$ because $f_2$ is injective,} \inset{if $x\in X_1$ and $x'\in X_2$, $(x,f(x))\in R$ and $x\in I$ so $f(x)\in R[I]$, while \Centerline{$(x',f(x')) =(x',f_2(x'))\in R_2\subseteq (X\setminus I)\times(Y\setminus R[I])$} \noindent so $f(x')\notin R[I]$, and $f(x')$ cannot be equal to $f(x)$,} \inset{if $x\in X_2$ and $x'\in X_1$ then $f(x')\in R[I]$ while $f(x)\notin R[I]$, so $f(x)\ne f(x')$.} \noindent Thus $f(x)\ne f(x')$ in all cases, and $f$ is injective.
So we have a function of the kind we need, at least when there is an $I\subseteq X$, neither $\emptyset$ nor $X$, such that $\#(R[I])=\#(I)$. \medskip {\bf Case 2} Suppose that $\#(R[I])\ne\#(I)$ whenever $I\subseteq X$ and $0<\#(I)<\#(X)$; that is, $\#(R[I])>\#(I)$ for every such $I$. We are supposing that $\#(X)=n>0$, so $X\ne\emptyset$; take any $x_0\in X$. Since $\#(R[\{x_0\}])\ge\#(\{x_0\})=1$, $R[\{x_0\}]$ is not empty; take any $y_0\in R[\{x_0\}]$, so that $(x_0,y_0)\in R$. Now set $X_1=X\setminus\{x_0\}$ and set \Centerline{$R_1=R\cap((X\setminus\{x_0\})\times(Y\setminus\{y_0\}))$.} \noindent Then $R_1\subseteq X_1\times Y$. If $J\subseteq X_1$ and $J\ne\emptyset$, then $0<\#(J)<\#(X)$ so $\#(R[J])>\#(J)$. But $R_1[J]\supseteq R[J]\setminus\{y_0\}$, so \Centerline{$\#(R_1[J])\ge\#(R[J])-1\ge\#(J)$.} \noindent If $J=\emptyset$ then of course $\#(R_1[J])\ge 0=\#(J)$. Also $\#(X_1)<\#(X)=n$. By the inductive hypothesis yet again, there is an injective function $f_1:X_1\to Y$ such that $(x,f_1(x))\in R_1$ for every $x\in X_1$. Define $f:X\to Y$ by setting $$\eqalign{f(x)&=y_0\text{ if }x=x_0\cr &=f_1(x)\text{ if }x\in X_1.\cr}$$ \noindent If $x=x_0$, then \Centerline{$(x,f(x))=(x_0,y_0)\in R$;} \noindent if $x\in X_1$, then \Centerline{$(x,f(x))=(x,f_1(x))\in R_1\subseteq R$.} \noindent So $(x,f(x))\in R$ for every $x\in X$. If $x$, $x'\in X$ are distinct, then \inset{if $x$, $x'\in X_1$, $f(x)=f_1(x)\ne f_1(x')=f(x')$ because $f_1$ is injective,} \inset{if $x=x_0$ and $x'\in X_1$, $f(x_0)=y_0$, while \Centerline{$(x',f(x')) =(x',f_1(x'))\in R_1 \subseteq (X\setminus\{x_0\})\times(Y\setminus\{y_0\})$} \noindent so $f(x')\ne y_0$, and $f(x')$ cannot be equal to $f(x)$,} \inset{if $x\in X_1$ and $x'=x_0$ then $f(x')=y_0\ne f(x)$.} \noindent Thus $f(x)\ne f(x')$ in all cases, and $f$ is injective. So we have a function of the kind we need in this case also. Thus the induction proceeds.
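The proof just given is in effect a recursive construction, and for small sets it can be carried out mechanically. The following Python sketch is my own rendering of the two cases (it is an illustration, not part of the proof); it searches for a critical set as in case 1, otherwise pairs off one couple as in case 2, and assumes, as in the lemma, that $R$ satisfies the condition $\#(R[I])\ge\#(I)$:

```python
from itertools import combinations

def hall_matching(X, R):
    """Given a list X and a set R of pairs (x, y) satisfying Hall's
    condition #(R[I]) >= #(I) for every subset I of X, return an
    injective dict f with (x, f(x)) in R for every x in X."""
    def image(I):
        return {y for (x, y) in R if x in I}
    if not X:                    # base step: the empty function
        return {}
    # case 1: look for a critical I with 0 < #(I) < #(X) and #(R[I]) = #(I)
    for size in range(1, len(X)):
        for I in map(set, combinations(X, size)):
            if len(image(I)) == len(I):
                RI = image(I)
                f1 = hall_matching([x for x in X if x in I],
                                   {(x, y) for (x, y) in R if x in I})
                f2 = hall_matching([x for x in X if x not in I],
                                   {(x, y) for (x, y) in R
                                    if x not in I and y not in RI})
                return {**f1, **f2}
    # case 2: no critical set; pair x0 with any partner y0 and recurse
    x0 = X[0]
    y0 = next(y for (x, y) in R if x == x0)
    f = hall_matching(X[1:], {(x, y) for (x, y) in R
                              if x != x0 and y != y0})
    f[x0] = y0
    return f
```

Note that the subset search in case 1 inspects up to $2^{\#(X)}$ sets $I$, which is exactly the inefficiency discussed in the remark below.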
\medskip {\bf Remark} The above proof is complete and correct; I do not think it is `easy', but I hope you can see the principle involved. If there is a critical set $I$ such that $R[I]$ is exactly the same size as $I$, then the whole of $R[I]$ will have to be used up in allocating the members of $I$, and the rest of $X$ will have to make do with members of $Y\setminus R[I]$; `case 1' of the inductive step is just the check that in fact they will be able to manage. If there isn't a critical set, then we can get away with making the first assignment $f(x_0)$ without looking at anyone else, and the rest will again be able to make do with what's left. I ought however to tell you that for many purposes this is not the `best' proof. The trouble is that it gives you no indication of how to actually make the allocation. If $X$ has 1000 members, say, the proof seems to ask you to look at every one of the $2^{1000}$ subsets $I$ of $X$ to see if one of them has $\#(R[I])=\#(I)$. This will forever be completely impracticable. There is however a quite different (and more difficult) proof of the theorem which provides an algorithm for finding the injective function; on currently available computers, that algorithm can be made to work so long as $\#(R)$ is not too enormous -- $\#(R)=10000$ shouldn't be beyond a fast PC with a patient customer. \end