curve-fitting problem the problem of making predictions from past observations by fitting curves to the data. Curve fitting has two steps: first, select a family of curves; then, find the best-fitting curve by some statistical criterion such as the method of least squares (e.g., choose the curve that has the least sum of squared deviations between the curve and the data). The method was first proposed by Adrien-Marie Legendre (1752–1833) and Carl Friedrich Gauss (1777–1855) in the early nineteenth century as a way of inferring planetary trajectories from noisy data. More generally, curve fitting may be used to construct low-level empirical generalizations. For example, suppose that the ideal gas law, P = nkT, is chosen as the form of the law governing the dependence of the pressure P on the equilibrium temperature T of a fixed volume of gas, where n is the molecular number per unit volume and k is Boltzmann’s constant (a universal constant equal to 1.3804 × 10⁻¹⁶ erg °C⁻¹). When the parameter nk is adjustable, the law specifies a family of curves – one for each numerical value of the parameter. Curve fitting may be used to determine the best-fitting member of the family, thereby effecting a measurement of the theoretical parameter, nk.
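For concreteness, here is a minimal sketch in Python of such a measurement; the pressure and temperature readings are hypothetical, and for the one-parameter family P = nkT the least squares criterion admits a closed-form solution.

```python
# Minimal sketch: least-squares measurement of the single parameter nk
# in the family P = nk * T. The readings below are hypothetical.
temps = [250.0, 275.0, 300.0, 325.0, 350.0]           # T_i, temperatures
pressures = [3.45e4, 3.80e4, 4.14e4, 4.49e4, 4.83e4]  # P_i, pressures

# Minimizing sum_i (P_i - nk*T_i)^2 over nk gives the closed form
# nk = (sum_i T_i*P_i) / (sum_i T_i^2).
nk = sum(t * p for t, p in zip(temps, pressures)) / sum(t * t for t in temps)

ssd = sum((p - nk * t) ** 2 for t, p in zip(temps, pressures))
print(f"best-fitting nk = {nk:.2f}; sum of squared deviations = {ssd:.3g}")
```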
The philosophically vexing problem is how to justify the initial choice of the form of the law. On the one hand, one might choose a very large, complex family of curves, which would ensure excellent fit with any data set. The problem with this option is that the best-fitting curve may overfit the data. If too much attention is paid to the random elements of the data, then the predictively useful trends and regularities will be missed. If it looks too good to be true, it probably is. On the other hand, simpler families run a greater risk of making grossly false assumptions about the true form of the law. Intuitively, the solution is to choose a simple family of curves that maintains a reasonable degree of fit. The simplicity of a family of curves is measured by the paucity of parameters. The problem is to say how and why such a trade-off between simplicity and goodness of fit should be made.
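The trade-off can be exhibited in a small numerical experiment. The sketch below (hypothetical data; numpy assumed available) fits both a simple and a complex family to noisy samples from a linear law: the complex family fits the observed data more closely but typically predicts held-out data worse.

```python
# Sketch of the simplicity/fit trade-off: a complex family (degree-7
# polynomial) fits the sample more closely than a simple family (straight
# line), yet typically predicts new data worse. Data are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
y = 2.0 * x + rng.normal(scale=0.1, size=x.size)   # true trend is linear
x_new = np.linspace(0.0, 1.0, 100)                 # held-out points
y_new = 2.0 * x_new + rng.normal(scale=0.1, size=x_new.size)

for degree in (1, 7):
    coeffs = np.polyfit(x, y, degree)              # least-squares fit
    fit_ssd = np.sum((np.polyval(coeffs, x) - y) ** 2)
    pred_mse = np.mean((np.polyval(coeffs, x_new) - y_new) ** 2)
    print(f"degree {degree}: fit SSD = {fit_ssd:.4f}, "
          f"prediction MSE = {pred_mse:.4f}")
```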
When a theory can accommodate recalcitrant data only by the ad hoc – i.e., improperly motivated – addition of new terms and parameters, students of science have long felt that the subsequent increase in the degree of fit should not count in the theory’s favor, and such additions are sometimes called ad hoc hypotheses. The best-known example of this sort of ad hoc hypothesizing is the addition of epicycles upon epicycles in the planetary astronomies of Ptolemy and Copernicus. This is an example in which a gain in fit need not compensate for the loss of simplicity.
Contemporary philosophers sometimes formulate the curve-fitting problem differently. They often assume that there is no noise in the data, and speak of the problem of choosing among different curves that fit the data exactly. The problem is then to choose the simplest curve from among all those that pass through every data point. The difficulty is that there is no universally accepted way of defining the simplicity of single curves. However the problem is formulated, it is widely agreed that simplicity should play some role in theory choice. Rationalists have championed the curve-fitting problem as exemplifying the underdetermination of theory by data and the need to make a priori assumptions about the simplicity of nature. Those philosophers who think that we have no such a priori knowledge still need to account for the relevance of simplicity to science.
Whewell described curve fitting as the colligation of facts in the quantitative sciences, and the agreement in the measured parameters (coefficients) obtained by different colligations of facts as the consilience of inductions. Different colligations of facts (say, on the same gas at different volumes, or on other gases) may yield good agreement among independently measured values of parameters (like the molecular density of the gas and Boltzmann’s constant). By identifying different parameters found to agree, we constrain the form of the law without appealing to a priori knowledge (good news for empiricism). But the accompanying increase in unification also worsens the overall degree of fit. Thus, there is also the problem of how and why we should trade off unification against total degree of fit.
Statisticians often refer to a family of hypotheses as a model. A rapidly growing literature in statistics on model selection has not yet produced any universally accepted formula for trading off simplicity with degree of fit (one well-known proposal is sketched below). However, there is wide agreement among statisticians that the paucity of parameters is the appropriate way of measuring simplicity. See also EXPLANATION, PHILOSOPHY OF SCIENCE, WHEWELL. M.R.F.
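Although, as noted above, no formula commands universal acceptance, one well-known proposal is Akaike’s information criterion (AIC), which for a least-squares fit scores a model by n·ln(RSS/n) + 2k, penalizing each of the k adjustable parameters. The sketch below illustrates the idea on hypothetical data; it is offered as one candidate trade-off formula, not as the entry’s own proposal.

```python
# Illustrative sketch (one proposal among many): Akaike's information
# criterion trades off fit (residual sum of squares, RSS) against
# simplicity (the number of adjustable parameters, k). Lower AIC is better.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + rng.normal(scale=0.1, size=x.size)   # true law is linear

n = x.size
for k in (2, 4, 8):        # a degree-(k-1) polynomial has k parameters
    coeffs = np.polyfit(x, y, k - 1)
    rss = np.sum((np.polyval(coeffs, x) - y) ** 2)
    aic = n * np.log(rss / n) + 2 * k
    print(f"k = {k} parameters: RSS = {rss:.4f}, AIC = {aic:.2f}")
```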