Scientists Present New Solution to Imbalanced Learning Problem
Specialists at the HSE Faculty of Computer Science and Sber AI Lab have developed a geometric oversampling technique known as Simplicial SMOTE. Tests on various datasets have shown that it significantly improves classification performance. The technique is particularly valuable in scenarios where rare cases are crucial, such as fraud detection or the diagnosis of rare diseases. The study's results are available on arXiv.org, an open-access archive, and will be presented at the International Conference on Knowledge Discovery and Data Mining (KDD) in summer 2025 in Toronto, Canada.
The problem of imbalanced learning is becoming increasingly relevant across various fields, including banking and medicine. Conventional methods, such as random oversampling, often generate low-quality samples or fail to accurately model rare class data.
Simplicial SMOTE (Synthetic Minority Oversampling Technique), a novel solution proposed by scientists from HSE University and Sber AI Lab, addresses these issues by enabling more accurate modelling of complex topological data structures and improving classifier performance on imbalanced datasets.
It generates new examples of a rare class by leveraging information from several neighbouring instances at once (a 'simplex'), rather than from just two close points, as in the original SMOTE and its well-known modifications. This lets the synthetic examples cover the region occupied by the rare class more fully, which in turn improves classifier performance. The technique improves training on imbalanced data, where one class (e.g., normal transactions) has many examples, while another class (e.g., fraud) has few.
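The difference between the two sampling schemes can be sketched in a few lines of Python. This is a simplified illustration and not the authors' implementation: the function names, the choice of a point plus its k nearest neighbours as the simplex, and the Dirichlet weights are assumptions made for the example, and the published method may construct its simplices differently.

```python
import numpy as np

def smote_sample(X, k=3, rng=None):
    """SMOTE-style step: interpolate between a minority point and ONE neighbour."""
    rng = np.random.default_rng(rng)
    i = rng.integers(len(X))                      # pick a random minority point
    dist = np.linalg.norm(X - X[i], axis=1)
    neighbours = np.argsort(dist)[1:k + 1]        # k nearest, skipping the point itself
    j = rng.choice(neighbours)
    t = rng.random()                              # position on the segment, in [0, 1)
    return X[i] + t * (X[j] - X[i])

def simplex_sample(X, k=3, rng=None):
    """Simplex-style step (illustrative): mix a point with SEVERAL neighbours."""
    rng = np.random.default_rng(rng)
    i = rng.integers(len(X))
    dist = np.linalg.norm(X - X[i], axis=1)
    vertices = np.argsort(dist)[:k + 1]           # the point and its k nearest neighbours
    # Assumption: Dirichlet(1, ..., 1) weights, i.e. non-negative and summing to 1,
    # so the new point is a convex combination of the chosen vertices.
    weights = rng.dirichlet(np.ones(len(vertices)))
    return weights @ X[vertices]

# Hypothetical minority-class points in 2D
X_minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, 5.0]])
edge_point = smote_sample(X_minority, k=2, rng=0)
simplex_point = simplex_sample(X_minority, k=2, rng=0)
```

With k=2 the first function can only place a new point on a line segment between two existing points, while the second can place it anywhere inside the triangle spanned by three of them.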
Researchers have shown experimentally, on a large number of test datasets, that the proposed approach achieves significantly better performance metrics, such as the F1 score and the Matthews correlation coefficient, than both the basic SMOTE and its modifications. In particular, an improvement was observed for gradient boosting, a classifier commonly used in practice.
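For readers unfamiliar with these two metrics, the sketch below computes them from the four cells of a binary confusion matrix using their standard definitions. The function name and the counts are hypothetical and are not taken from the study.

```python
from math import sqrt

def f1_and_mcc(tp, fp, fn, tn):
    """Return (F1 score, Matthews correlation coefficient) for the rare class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: MCC is taken as 0 when the denominator is 0
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc

# Hypothetical test set: 10 fraud cases among 100 transactions.
# A classifier that finds 8 of the 10 with 2 false alarms:
good = f1_and_mcc(tp=8, fp=2, fn=2, tn=88)
# A classifier that always predicts "normal" is 90% accurate,
# yet both metrics drop to zero:
trivial = f1_and_mcc(tp=0, fp=0, fn=10, tn=90)
```

Both metrics stay at zero for the always-"normal" classifier despite its 90% accuracy, which is why they are preferred over plain accuracy on imbalanced data.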
'Our technique is particularly effective for tasks involving imbalanced data, where the rare class holds greater significance. Banks can use Simplicial SMOTE to detect fraud more effectively, and medical centres can apply it to diagnose rare diseases,' says Andrey Savchenko, co-author of the article and Leading Research Fellow at the Laboratories for Theoretical Modelling in AI of the HSE AI and Digital Science Institute.
The new technique can be integrated into existing oversampling algorithms (such as Borderline-SMOTE, Safe-level-SMOTE, and ADASYN), enabling better accuracy without significantly increasing computational complexity. According to the researchers, the developed approach could contribute to the creation of more accurate and reliable machine learning models, thereby improving the quality of analytics.
The study was conducted with support from the HSE Basic Research Programme.