We study the least squares regression function estimator over the class of real-valued functions on [0,1]^d that are increasing in each coordinate.  For uniformly bounded signals and with a fixed, cubic lattice design, we establish that the estimator achieves the minimax rate of order n^{-min(2/(d+2),1/d)} in the empirical L_2 loss, up to poly-logarithmic factors.  Further, we prove a sharp oracle inequality, which reveals in particular that when the true regression function is piecewise constant on $k$ hyperrectangles, the least squares estimator enjoys a faster, adaptive rate of convergence of (k/n)^{min(1,2/d)}, again up to poly-logarithmic factors.  Previous results are confined to the case d≤2.  Finally, we establish corresponding bounds (which are new even in the case d=2) in the more challenging random design setting.  There are two surprising features of these results: first, they demonstrate that it is possible for a global empirical risk minimisation procedure to be rate optimal up to poly-logarithmic factors even when the corresponding entropy integral for the function class diverges rapidly; second, they indicate that the adaptation rate for shape-constrained estimators can be strictly worse than the parametric rate.