Post by shikharani00189 on Oct 31, 2024 8:53:47 GMT 2
Yandex is a strong search engine when it comes to artificial intelligence and machine learning. It continues to demonstrate this with the announcement of Korolev, a new semantic understanding algorithm. With Korolev, Yandex should be able to better analyze the semantic vectors that connect search queries to the entities present in the search engine's index.
How does semantic search work on Yandex?
At the time of writing, Yandex has explained how its semantic search method works, and it is quite interesting. Several algorithms come into play as the work progresses.
The Matrixnet algorithm (possibly replaced by CatBoost) is roughly equivalent to Google's RankBrain: it is built into the ranking algorithm to apply machine learning and order the results as logically as possible. However, it is not sufficient on its own. It requires prior semantic analysis to optimize the loading of results and the relationships between semantic vectors.
Yandex therefore first needs lists of "pre-appropriate" pages that respond to the words present in the queries. Here, it is the Palekh algorithm that currently comes into play. Introduced in 2016, the algorithm uses neural networks to better understand the results. However, the algorithm only converts search queries and page titles directly into numbers (values of the semantic vectors). Then, Matrixnet only has to compare the entities to offer an appropriate ranking.
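To make the "queries and titles become numbers" idea concrete, here is a minimal sketch in Python. It is not Yandex's code: Palekh uses a neural network, whereas this toy swaps in a plain bag-of-words vector, and the names embed, cosine and rank_titles are made up for illustration. It only shows the general shape of the step, which is to turn the query and each title into a vector and then compare them.

```python
from collections import Counter
import math


def embed(text):
    """Toy stand-in for a neural encoder (assumption): a normalised bag-of-words vector."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


def rank_titles(query, titles):
    """Score every title against the query and return (score, title), best first."""
    q = embed(query)
    scored = [(cosine(q, embed(t)), t) for t in titles]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    for score, title in rank_titles(
        "korolev algorithm",
        ["Korolev algorithm explained", "Cooking pasta at home"],
    ):
        print(round(score, 3), title)
```

A real semantic model would also match titles that share no words with the query, which is exactly what word counting cannot do.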
What is the point of the Korolev algorithm?
Korolev is a heavily upgraded and improved Palekh. The first difference is that the new algorithm analyzes entire pages, not only the titles. Consequently, all the content is transformed into semantic vectors according to what the system recognizes (expressions, forms...). The problem with Korolev is the amount of resources needed to transform all the content into semantic vectors. Indeed, neural networks and machine learning imply a certain latency in understanding all the entities of a text. Palekh, on the other hand, is much less powerful but faster, since it only analyzes the titles in depth.
Semantic analysis is one of the first steps in returning results. Indeed, it is first necessary to analyze the words, expressions and vectors to know which pages can claim a final place in the SERPs. Palekh acts directly and analyzes on average only 150 documents so as not to consume too many resources. This means that the ranking algorithms use Palekh only for a tiny number of pages and queries, as in the early days of Google RankBrain (only 15% of queries originally, but 100% now).
The Korolev algorithm collects behavioral statistics based on what users searched for before reaching a page (the original query), the time spent on the page, the bounce rate, etc. This way, Korolev can determine whether one page is better at answering a query than another. For example, if a user spends several minutes on a page after typing a particular query, the page is probably answering the request.
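As a rough illustration of how such behavioral signals could be compared, here is a small Python sketch. Everything in it is an assumption: the Visit record, its field names and the scoring heuristic (mean time on page, discounted by bounce rate) are invented for the example, and Yandex has not published how it weighs these statistics.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """One hypothetical log entry: a user arrived on `url` after searching `query`."""
    query: str
    url: str
    dwell_seconds: float
    bounced: bool


def engagement(visits, query, url):
    """Return (mean dwell seconds, bounce rate) for one query/page pair, or None if unseen."""
    relevant = [v for v in visits if v.query == query and v.url == url]
    if not relevant:
        return None
    mean_dwell = sum(v.dwell_seconds for v in relevant) / len(relevant)
    bounce_rate = sum(v.bounced for v in relevant) / len(relevant)
    return mean_dwell, bounce_rate


def better_page(visits, query, url_a, url_b):
    """Pick the page with the higher made-up score: mean dwell * (1 - bounce rate)."""
    def score(url):
        stats = engagement(visits, query, url)
        return stats[0] * (1.0 - stats[1]) if stats else 0.0
    return url_a if score(url_a) >= score(url_b) else url_b
```

For the same query, a page where visitors stay several minutes would win over a page that visitors leave after a few seconds, which matches the example given above.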
Korolev's strength is therefore that it converts not only all content but also the human behavior relating to web pages. And the algorithm does this for more than 200,000 documents, far from the 150 documents analyzed by Palekh. However, as the use of neural networks is still resource-intensive, Korolev is applied upstream of searches, during indexing. The comparison of semantic vectors is therefore done against already known entities, and no longer in real time. Hence the importance of understanding queries and knowing as many semantic connections as possible in advance.
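The "computed during indexing, compared at search time" split can be sketched like this in Python. Again, this is only an illustration under stated assumptions: the PrecomputedIndex class and its method names are hypothetical, and the bag-of-words embed function stands in for the neural network that Korolev actually uses.

```python
from collections import Counter
import math


def embed(text):
    """Toy stand-in for the expensive neural encoder (assumption): normalised bag-of-words."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


class PrecomputedIndex:
    """Hypothetical index that stores one vector per page, built ahead of any search."""

    def __init__(self):
        self.page_vectors = {}

    def index_page(self, url, body):
        # The costly step: runs once per page at indexing time, not per query.
        self.page_vectors[url] = embed(body)

    def search(self, query, top_k=3):
        # At query time only the short query is embedded; page vectors are just looked up.
        q = embed(query)
        scored = [
            (sum(w * vec.get(tok, 0.0) for tok, w in q.items()), url)
            for url, vec in self.page_vectors.items()
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]
```

The design choice is the usual trade-off: more storage and indexing work up front, in exchange for a cheap comparison when the user actually searches.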