
Bayesian Ying-Yang system, best harmony learning, and five action circling

Research Article, published in Frontiers of Electrical and Electronic Engineering in China

Abstract

First proposed in 1995 and systematically developed over the past decade, Bayesian Ying-Yang learning is a statistical approach to a two-pathway intelligent system via two complementary Bayesian representations of a joint distribution on the external observation X and its inner representation R, which can be understood from the perspective of the ancient Ying-Yang philosophy. We have q(X,R) = q(X|R)q(R) as Ying, which is primary, with its structure designed according to the tasks of the system, and p(X,R) = p(R|X)p(X) as Yang, which is secondary, with p(X) given by samples of X while the structure of p(R|X) is designed from Ying according to a Ying-Yang variety preservation principle, i.e., p(R|X) is designed as a functional with q(X|R) and q(R) as its arguments. We call this pair a Bayesian Ying-Yang (BYY) system. A Ying-Yang best harmony principle is proposed for learning all the unknowns in the system, with the help of an implementation featured by a five-action circling under the name of the A5 paradigm. Interestingly, this coincides with the famous ancient WuXing theory, which provides a general guide to keeping the A5 circling well balanced towards a Ying-Yang best harmony. BYY learning provides not only a general framework that accommodates typical learning approaches from a unified perspective but also a new road that leads to improved model selection criteria, Ying-Yang alternative learning with automatic model selection, and coordinated implementation of Ying-based model selection and Yang-based learning regularization. This paper aims at an introduction to BYY learning with a twofold purpose. On one hand, we introduce the fundamentals of BYY learning, including system design principles of least redundancy versus variety preservation, global learning principles of Ying-Yang harmony versus Ying-Yang matching, local updating mechanisms of rival penalized competitive learning (RPCL) versus maximum a posteriori (MAP) competitive learning, and learning regularization by data smoothing and the induced bias cancelation (IBC) prior. We also introduce basic implementing techniques, including apex approximation, primal gradient flow, Ying-Yang alternation, and the Sheng-Ke-Cheng-Hui law. On the other hand, we provide a tutorial on learning algorithms for a number of typical learning tasks, including Gaussian mixture; factor analysis (FA) with independent Gaussian, binary, and non-Gaussian factors; local FA; temporal FA (TFA); hidden Markov model (HMM); hierarchical BYY; three-layer networks; mixture of experts; radial basis functions (RBFs); and subspace based functions (SBFs). This tutorial introduces BYY learning algorithms in comparison with typical algorithms, particularly against the benchmark of the expectation-maximization (EM) algorithm for maximum likelihood. These algorithms are summarized in a unified Ying-Yang alternation procedure, with the major parts in the same expression and the differences characterized simply by a few options in some subroutines. Additionally, a new insight is provided on the ancient Chinese philosophy of Yin-Yang and WuXing from the perspective of information science and intelligent systems.
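
A rough formal sketch may help fix the notation: the two complementary decompositions and the two global learning principles contrasted in the abstract can be written as below. The harmony functional H(p||q) is the form commonly stated in the BYY literature; the precise functionals, including regularization terms, are developed in the paper itself.

```latex
% Ying and Yang: two complementary Bayesian decompositions of the joint
% distribution on the observation X and its inner representation R.
\text{Ying: } q(X,R) = q(X\mid R)\,q(R), \qquad
\text{Yang: } p(X,R) = p(R\mid X)\,p(X).

% Ying-Yang matching: minimize the Kullback-Leibler divergence from Yang to
% Ying (maximum likelihood and the EM algorithm arise as special cases).
\mathrm{KL}(p\,\|\,q) = \int p(R\mid X)\,p(X)\,
  \ln\frac{p(R\mid X)\,p(X)}{q(X\mid R)\,q(R)}\,\mathrm{d}X\,\mathrm{d}R .

% Ying-Yang best harmony: maximize the harmony functional, which favors both
% a good fit and a parsimonious (least redundant) inner representation.
H(p\,\|\,q) = \int p(R\mid X)\,p(X)\,
  \ln\bigl[q(X\mid R)\,q(R)\bigr]\,\mathrm{d}X\,\mathrm{d}R .
```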
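To make the Ying-Yang alternation concrete, here is a minimal, illustrative sketch for a one-dimensional Gaussian mixture; it is not the paper's algorithm, and the function names and the hard_cut option are ours. With a soft Yang posterior the alternation reduces to the EM benchmark for maximum likelihood; replacing the posterior with a winner-take-all assignment gives the hard-cut, MAP-competitive flavor associated with harmony learning.

```python
# Minimal sketch (assumptions noted above): Ying-Yang alternation for a 1-D
# Gaussian mixture. The Yang step builds p(R|X) from the current Ying
# components q(x|r)q(r); the Ying step re-estimates q(x|r)q(r) from that
# posterior. hard_cut=False recovers plain EM.
import numpy as np

def gaussian_pdf(x, mean, var):
    """N(x | mean, var), evaluated elementwise with broadcasting."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

def ying_yang_alternation(x, k=3, hard_cut=False, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    n = len(x)
    # Ying structure q(x|r)q(r): component means, variances, mixing weights.
    means = rng.choice(x, size=k, replace=False)
    vars_ = np.full(k, np.var(x))
    alpha = np.full(k, 1.0 / k)
    for _ in range(iters):
        # Yang step: posterior p(r|x) designed from the Ying components.
        joint = alpha * gaussian_pdf(x[:, None], means, vars_)  # shape (n, k)
        post = joint / (joint.sum(axis=1, keepdims=True) + 1e-300)
        if hard_cut:
            # MAP competitive learning: winner takes all.
            winners = post.argmax(axis=1)
            post = np.zeros_like(post)
            post[np.arange(n), winners] = 1.0
        # Ying step: re-estimate q(x|r) and q(r) from the posterior weights.
        nk = post.sum(axis=0) + 1e-12
        alpha = nk / n
        means = (post * x[:, None]).sum(axis=0) / nk
        vars_ = (post * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
    return alpha, means, vars_

# Usage: data from two Gaussians; hard_cut=False would run plain EM instead.
x = np.concatenate([np.random.default_rng(1).normal(-2, 1, 300),
                    np.random.default_rng(2).normal(3, 0.5, 300)])
print(ying_yang_alternation(x, k=2, hard_cut=True))
```

Note that BYY harmony learning additionally drives the mixing weights of superfluous components towards zero, yielding automatic model selection; that behavior is not reproduced by this toy sketch.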



Author information

Corresponding author

Correspondence to Lei Xu.

Additional information

Lei Xu is a chair professor at The Chinese University of Hong Kong (CUHK), a Chang Jiang Chair Professor of Peking University, a guest research fellow of the Institute of Biophysics, Chinese Academy of Sciences, and an honorary professor of Xidian University. He graduated from Harbin Institute of Technology at the end of 1981 and completed his master's and Ph.D. theses at Tsinghua University during 1982-1986. He then joined the Department of Mathematics, Peking University in 1987, first as a postdoc, and was exceptionally promoted to associate professor in 1988 and to full professor in 1992. During 1989-1993 he worked at several universities in Finland, Canada, and the USA, including Harvard and MIT. He joined CUHK in 1993 as a senior lecturer, became a professor in 1996, and has held his current position since 2002.

Prof. Xu has published dozens of journal papers and many papers in conference proceedings and edited books, covering statistical learning, neural networks, and pattern recognition, with a number of well-cited papers: over 3200 citations according to SCI-Expanded (SCI-E) and over 5500 according to Google Scholar (GS), with over 2000 (SCI-E) and 3600 (GS) for his ten most frequently cited papers. He has served as an associate editor of several journals, including Neural Networks (1995-present) and IEEE Transactions on Neural Networks (1994-1998), and as general chair or program committee chair of a number of international conferences.

Moreover, Prof. Xu has served on the governing board of the International Neural Networks Society (INNS) (2001-2003), the INNS Award Committee (2002-2003), and the Fellow Committee of the IEEE Computational Intelligence Society (2006, 2008); as chair of the Computational Finance Technical Committee of the IEEE Computational Intelligence Society (2001-2003); and as a past president of the Asian-Pacific Neural Networks Assembly (APNNA). He has also served as an engineering panel member of the Hong Kong RGC Research Committee (2001-2006), a selection committee member of the Chinese NSFC/HK RGC Joint Research Scheme (2002-2005), an external expert for the Chinese NSFC Information Science (IS) Panel (2004-2006, 2008), an external expert for the Chinese NSFC IS Panel for distinguished young scholars (2009-2010), and a nominator for the prestigious Kyoto Prize (2003, 2007). Prof. Xu has received several Chinese national academic awards (including the 1993 National Natural Science Award) and international awards (including the 1995 INNS Leadership Award and the 2006 APNNA Outstanding Achievement Award). He has been an IEEE Fellow since 2001, a Fellow of the International Association for Pattern Recognition since 2002, and a member of the European Academy of Sciences since 2002.

About this article

Cite this article

Xu, L. Bayesian Ying-Yang system, best harmony learning, and five action circling. Front. Electr. Electron. Eng. China 5, 281–328 (2010). https://doi.org/10.1007/s11460-010-0108-9

