Development Of Genetic Programming Techniques For Water Industry Applications (1998-2001)

Funding body: Engineering and Physical Sciences Research Council (EPSRC)

The objective identified in the proposal was to investigate potential applications of genetic programming (GP) that would benefit the water industry. The proposal also stressed that the project would develop a novel algorithm, and that the investigation would not simply consist of applying the existing genetic programming algorithm (Koza, 1992; 1994) to selected problems. The importance of comparing the new method with existing techniques was also identified. In recent years many methods for creating "black box" mathematical models have been reported in the engineering literature. These methods include artificial neural networks, polynomial networks and genetic programming.

The research literature has tended to emphasise the benefits of these new methods, particularly their ability to create mathematical models automatically without the form of the equation having to be specified in advance, as many older regression methods require. However, the literature has tended to downplay one of the major disadvantages of the new methods: the inability to determine confidence limits on predictions. Measures of error or uncertainty are often critical for models used in engineering applications, where the consequences of error may include damage to property or loss of life. After preliminary examination, the original method of genetic programming (i.e., Koza-style GP, and symbolic regression in particular) was found to be deficient in a number of respects in addition to the inability to provide confidence limits. These include:

  • Resulting models are not necessarily smooth and can have peculiar discontinuities or spikes. These result from the use of conditionals (if-then statements) and from a mathematical exception used to force closure under division: to avoid division overflow errors, division by zero is defined to produce zero rather than infinity.
  • Resulting models are often very complex and difficult to interpret. There is no comprehensive method to determine if models are overfit or underdetermined.
  • Solutions are generally not very good.
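The spike produced by the division exception described in the list above can be illustrated with a short sketch. The function names and the expression 1 / (x - 2) are hypothetical and chosen only for illustration; they are not taken from the project or from any particular GP library.

```python
def protected_div(numerator: float, denominator: float) -> float:
    """Protected division: return 0.0 instead of failing when dividing by zero."""
    if denominator == 0.0:
        return 0.0
    return numerator / denominator


def candidate_model(x: float) -> float:
    """Hypothetical evolved expression 1 / (x - 2), built on protected division."""
    return protected_div(1.0, x - 2.0)


# Sample the model on either side of x = 2, where the denominator vanishes.
samples = [1.9, 1.99, 2.0, 2.01, 2.1]
outputs = [candidate_model(x) for x in samples]

for x, y in zip(samples, outputs):
    print(f"x = {x:<5} model = {y:.6g}")
```

The output is large and negative just below x = 2, exactly zero at x = 2, and large and positive just above it, so the model is neither smooth nor continuous at that point.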

It was concluded that, in its original form, genetic programming had been largely oversold and was not suitable for real-world civil engineering applications.

The first phase of the project was to develop an improved GP methodology by incorporating classical statistical methods for parameter optimisation into symbolic regression. Classical parameter optimisation would both improve the quality of solutions and allow for estimation of confidence limits. The development of a method that automatically generates (evolves) mathematical models that are amenable to statistical inference represents the best aspects of both approaches, provided that the resulting models are competitive in terms of accuracy with other methods. A new technique of this type would have much wider applications than water resources engineering alone. The technique could be used in any field where predictive or simulation models are used and minimising computational effort or the accurate assessment of uncertainty in predictions is critical.
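The idea of combining an evolved model structure with classical parameter optimisation can be sketched as follows. This is a minimal illustration and not the project's algorithm: it assumes the structure (here the hypothetical terms 1, x and x^2) has already been chosen, fits the coefficients by ordinary least squares, and reports the classical standard error of each coefficient. All names and data in the sketch are assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Matrix = List[List[float]]


def invert(a: Matrix) -> Matrix:
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_fixed_structure(
    terms: Sequence[Callable[[float], float]],
    xs: Sequence[float],
    ys: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Least-squares coefficients and standard errors for y = sum(c_i * term_i(x))."""
    n, p = len(xs), len(terms)
    X = [[t(x) for t in terms] for x in xs]  # design matrix
    xtx = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
           for i in range(p)]
    xty = [sum(X[k][i] * ys[k] for k in range(n)) for i in range(p)]
    xtx_inv = invert(xtx)
    coeffs = [sum(xtx_inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    residuals = [ys[k] - sum(c * v for c, v in zip(coeffs, X[k]))
                 for k in range(n)]
    s2 = sum(r * r for r in residuals) / (n - p)  # residual variance estimate
    std_errors = [(s2 * xtx_inv[i][i]) ** 0.5 for i in range(p)]
    return coeffs, std_errors


# Hypothetical structure, standing in for one proposed by symbolic regression.
terms = [lambda x: 1.0, lambda x: x, lambda x: x * x]

# Synthetic data: y = 1 + 2x + 0.5x^2 plus a small fixed perturbation.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
ys = [1.0 + 2.0 * x + 0.5 * x * x + e for x, e in zip(xs, noise)]

coeffs, std_errors = fit_fixed_structure(terms, xs, ys)
for c, se in zip(coeffs, std_errors):
    print(f"coefficient = {c:.4f}, standard error = {se:.4f}")
```

Because the model is linear in its coefficients, confidence limits follow from the standard errors in the usual way, by scaling them with a Student t quantile for n - p degrees of freedom. That step is omitted here to keep the sketch free of external dependencies.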

References

  • Davidson, J. W., D. A. Savic and G. A. Walters (2001a) Prediction error in rainfall-runoff models part 1: Overfitting, Water Resources Research, in preparation.
  • Davidson, J. W., D. A. Savic and G. A. Walters (2001b) Prediction error in rainfall-runoff models part 2: Probability density function of error, Water Resources Research, in preparation.
  • Davidson, J. W., D. A. Savic and G. A. Walters (2000a) Symbolic and numerical regression: experiments and applications. Accepted for publication in Journal of Information Sciences.
  • Davidson, J. W., D. A. Savic and G. A. Walters (2000b) Rainfall Runoff Modelling Using a New Polynomial Regression Method. 4th International Conference on Hydroinformatics, University of Iowa, Iowa City, USA.
  • Davidson, J. W., D. A. Savic and G. A. Walters (2000c) Approximators for the Colebrook-White Formula Obtained through a Hybrid Regression Method. XIII International Conference on Computational Methods in Water Resources, University of Calgary, Calgary, Canada.
  • Davidson, J. W., D. A. Savic and G. A. Walters (2000d) Symbolic and numerical regression: experiments and applications. Proceedings of Recent Advances in Soft Computing 2000, De Montfort University, Leicester, 175-182.
  • Davidson, J. W., D. A. Savic and G. A. Walters (1999a) Symbolic and numerical regression: a hybrid technique for polynomial approximators. Proceedings of Recent Advances in Soft Computing '99, De Montfort University, Leicester, 111-116.
  • Davidson, J. W., D. A. Savic and G. A. Walters (1999b) Method for the identification of explicit polynomial formulae for the friction in turbulent pipe flow. Journal of Hydroinformatics 1(2) 115-126.
  • Savic, D. A., G. A. Walters and J. W. Davidson (1999) A genetic programming approach to rainfall-runoff modelling. Water Resources Management 13, 219-231.
