\documentclass[12pt,reqno]{article}
\usepackage[usenames]{color}
\usepackage{amssymb}
\usepackage{graphicx}
\usepackage{amscd}
\usepackage[colorlinks=true,
linkcolor=webgreen,
filecolor=webbrown,
citecolor=webgreen]{hyperref}
\definecolor{webgreen}{rgb}{0,.5,0}
\definecolor{webbrown}{rgb}{.6,0,0}
\usepackage{color}
\usepackage{fullpage}
\usepackage{float}
\usepackage{psfig}
\usepackage{graphics,amsmath,amssymb}
\usepackage{amsthm}
\usepackage{amsfonts}
\usepackage{latexsym}
\usepackage{epsf}
\setlength{\textwidth}{6.5in}
\setlength{\oddsidemargin}{.1in}
\setlength{\evensidemargin}{.1in}
\setlength{\topmargin}{-.5in}
\setlength{\textheight}{8.9in}
\newcommand{\seqnum}[1]{\href{http://oeis.org/#1}{\underline{#1}}}
\begin{document}
\begin{center}
\epsfxsize=4in
\leavevmode\epsffile{logo129.eps}
\end{center}
\theoremstyle{plain}
\newtheorem{theorem}{Theorem}
\newtheorem{corollary}[theorem]{Corollary}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{proposition}[theorem]{Proposition}
\theoremstyle{definition}
\newtheorem{definition}[theorem]{Definition}
\newtheorem{example}[theorem]{Example}
\newtheorem{conjecture}[theorem]{Conjecture}
\theoremstyle{remark}
\newtheorem{remark}[theorem]{Remark}
\begin{center}
\vskip 1cm{\LARGE\bf Expectations of Family Sizes Subject to\\
\vskip .08 in
Minimum Numbers of Each Gender}
\vskip 1cm
\large
Martin Griffiths\\
Mathematical Institute\\
University of Oxford\\
Oxford OX1 3LB\\
United Kingdom\\
\href{mailto:martin.griffiths@maths.ox.ac.uk}{\tt martin.griffiths@maths.ox.ac.uk} \\
\ \\
Alexander Bramham\\
Wadham College\\
University of Oxford\\
Oxford OX1 3PN\\
United Kingdom\\
\href{mailto:alexander.bramham@wadh.ox.ac.uk}{\tt alexander.bramham@wadh.ox.ac.uk}
\end{center}
\vskip .2 in
\begin{abstract}
We consider the infinite families of sequences that arise from a
scenario involving the expected number of children born to a married
couple when certain conditions are placed on the minimum number of each
gender. We obtain both exact and asymptotic formulas for these
expectations, and go on to consider the long-term behavior of sequences
derived from them. For the situations studied here, it is shown that
the long-term behavior of each of the resulting sequences always falls
into one of two distinct categories.
\end{abstract}
\section{Introduction}
Suppose that a married couple plan to keep having children until the first instance at which they have at least one boy and one girl. Assuming that boys and girls are equally likely to be born, and excluding the possibility
of multiple births, it is a straightforward calculation to show that that the expected number of children in such a family is 3. In \cite{griffiths2} we considered a generalization of this scenario, and derived an expression for the expected number of children our couple would need to have in order that their family contains at least $n$ boys and $n$ girls.
In this article we generalize things yet further by studying the considerably more complicated scenario in which the parents keep having children until the first instance at which they have at least $m$ boys and $n$ girls. It is assumed throughout that only one child is born at a time (no twins, triplets, and so on) and that on the birth of each child we have $\mathbb{P}[\text{boy}]=\mathbb{P}[\text{girl}]=\frac{1}{2}$. Under these conditions, we obtain expressions for the expected number of children that would result.
Our expressions give rise to infinite families of integer sequences, and we consider here the contrasting long-term behavior of these sequences under various different scenarios. Indeed, the following definitions enable us to categorize these behaviors:
\begin{definition}
A sequence of integers $\left(x_n\right)_{n\geq1}$ is termed \textit{linear} if there exists $a,b\in\mathbb{Z}$ such that $x_n=an+b$ for all $n\in \mathbb{N}$.
\end{definition}
\begin{definition}
A sequence of integers $\left(y_n\right)_{n\geq1}$ is termed \textit{eventually-linear} if there exists some $N\in\mathbb{N}$ and $a,b\in\mathbb{Z}$ such that $y_n=an+b$ for all $n\geq N$.
\end{definition}
\begin{definition}\label{def2}
An increasing sequence of integers $\left(z_n\right)_{n\geq1}$ is termed \textit{quasi-linear} if it is not eventually-linear but, for any $n\in\mathbb{N}$, it is the case that either $z_{n+1}-z_n=a$ or $z_{n+1}-z_n=a+1$ for some fixed $a\in\mathbb{N}$.
\end{definition}
\noindent It is worth noting here that the quasi-linear sequences we shall encounter do in fact satisfy slightly more stringent conditions than those given in Definition \ref{def2}. This will be highlighted and explained in Section \ref{sectionscenario3}. In evaluating the expectations we obtain both exact and asymptotic formulas.
\section{The expectation as a finite sum}
Let the random variable $X_{m,n}$ denote the number of children born such that at least $m$ boys and $n$ girls result. Observe that over the first $m+n-1$ children, it is not possible to have $m$ boys and $n$ girls, so $X_{m,n} > m+n-1$. Let the random variable $Y$ represent the number of boys born out of the first $m+n-1$ children. It is clear that
\begin{equation}\label{simpleprob}
\mathbb{P}[Y=k]=\frac{1}{2^{m+n-1}}{{m+n-1} \choose k}.
\end{equation}
We obtain an expression for the expectation of $X_{m,n}$, denoted $\mathbb{E}\left[X_{m,n}\right]$, by conditioning on $Y$. To this end, consider the random variable $\mathbb{E}\left[X_{m,n}\mid Y=k\right]$. if there are $k$ boys, the remaining $l=(m+n-1)-k$ children are girls. Now, for $kN$. Therefore, when $n>N$, it is the case that
\begin{align*}
f(m,n)
&< \frac{2m}{2^{m+n-2}}\sum_{k=0}^{m-1}g_{k,m}(n)\\
&\leq \frac{2m^2 g_{m-1,m}(n)}{2^{m+n-2}}\\
&=\frac{m^2}{2^{m-3}}\cdot\frac{g_{m-1,m}(n)}{2^n}.
\end{align*}
Since the numerator and denominator of the fraction on the right exhibit polynomial and exponential growth, respectively, it follows that $f(m,n)\rightarrow 0$ as $n\rightarrow \infty$.
Thus, for any fixed $m\in\mathbb{N}$, $2n$ is an asymptotic formula for $\mathbb{E}\left[X_{m,n}\right]$ in a particularly strong sense. Not only is it true that
\[\frac{\mathbb{E}\left[X_{m,n}\right]}{2n}\rightarrow 1\quad\text{as}\quad n\rightarrow \infty,\]
but also $\mathbb{E}\left[X_{m,n}\right]-2n\rightarrow 0$. With $\langle x\rangle$ denoting the nearest integer to $x$, we are able to deduce from this that $\big(\left\langle\mathbb{E}\left[X_{m,n}\right]\right\rangle\big)_{n\geq 1}$ is an eventually-linear sequence, a result that may in fact be argued heuristically. As an example, the first 14 terms of $\big(\left\langle\mathbb{E}[X_{5,n}]\right\rangle\big)_{n\geq 1}$ are given by
\[10, 10, 11, 11, 12, 14, 15, 17, 19, 20, 22, 24, 26, 28,\ldots,\]
and is linear from the $10$th term onwards.
Furthermore, by considering the expression $f(m,n)$, it may be shown that, for fixed $m$,
\[\mathbb{E}\left[X_{m,n}\right]=2n+\frac{h(n)}{2^{n+m-2}(m-1)!},\]
where $h(n)$ is a monic polynomial of degree $m-1$. By way of an example,
\[\mathbb{E}\left[X_{5,n}\right]=2n+\frac{n^4+22n^3+203n^2+950n+1920}{2^{n+3}4!}.\]
\section{Scenario 2}
Now let $a$ and $b$ be fixed positive integers such that, without loss of generality, $a**0$ we have the following result \cite{wikihoeffding}:
\begin{equation}\label{hoeff}
\mathbb{P}[S\leq n(p-\epsilon)]\leq \exp\left(-2\epsilon^2 n\right).
\end{equation}
On setting $p=\frac{1}{2}$, $n(p-\epsilon)=ra-1$ and $n=r(a+b)-1$, we obtain
\[\epsilon=\frac{r(b-a)+1}{2\big(r(a+b)-1\big)}\]
(noting that $b>a$, so that $\epsilon$ is indeed positive). Thus, from (\ref{hoeff}),
\begin{align*}
\mathbb{P}[S\leq ra-1]
&\leq \exp\left(-2\left(\frac{r(b-a)+1}{2\big(r(a+b)-1\big)}\right)^2 \big(r(a+b)-1\big)\right)\\
&=\exp\left(-\frac{\big(r(b-a)+1\big)^2}{2\big(r(a+b)-1\big)}\right).
\end{align*}
Since, for fixed $a$ and $b$,
\[ra \exp\left(-\frac{\big(r(b-a)+1\big)^2}{2\big(r(a+b)-1\big)}\right)\rightarrow 0\quad\text{as}\quad r\rightarrow\infty,\]
it may be seen, from (\ref{tailbin}), that when $m$ and $n$ remain in proportion we have both
\[\frac{\mathbb{E}[X_{ra,rb}]}{2rb}\rightarrow 1\quad\text{and}\quad\mathbb{E}[X_{ra,rb}]-2rb\rightarrow 0\quad\text{as}\quad r\rightarrow \infty,\]
emphasizing here that the assumption $a**