When I checked my website logs for entries indicating the site had been "crawled", i.e., indexed, by DuckDuckGo, I found no entries in 2013 or 2014 for any of the IP addresses used by DuckDuckBot, DuckDuckGo's web crawler. DuckDuckGo's Sources webpage explains that although the search engine has its own web crawler, it relies heavily on indexes produced by the web crawlers of other search engines, stating:
DuckDuckGo gets its results from over one hundred sources, including DuckDuckBot (our own crawler), crowd-sourced sites (like Wikipedia, which are stored in our own index), Yahoo! (through BOSS), Yandex, WolframAlpha, and Bing.
The page also states that DuckDuckGo applies its own algorithm to rank the results it obtains from the other search engines on which it relies for data.
One of the search engines mentioned is Yandex. Its search engine, Yandex Search, can be accessed at www.yandex.com. According to the Wikipedia articles for Yandex and Yandex Search, the company operates the largest search engine in Russia, with about a 60% market share there; its search engine generated 64% of all Russian web search traffic in 2010. The article on the company also states:
Yandex ranked as the 4th largest search engine worldwide, based on information from Comscore.com, with more than 150 million searches per day as of April 2012, and more than 50.5 million visitors (all company's services) daily as of February 2013.
The article also indicates Yandex is heavily used in Ukraine and Kazakhstan, providing nearly a third of all search results in those countries, and 43% of all search results in Belarus.
When I searched this website's logs for this year, I found quite a few entries indicating the site had been indexed by the Yandex web crawler. That is, there were many entries containing the following user agent string:
"Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
On the homepage for this site, I include PHP code to notify me whenever Google's Googlebot indexes the site, so I updated that code to add a check that sends me an email alert whenever YandexBot indexes the site as well.
<?php
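// Address that receives the crawler alerts.
$email = "me@example.com";
// The User-Agent header may be absent, so fall back to an empty string.
$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : "";
// eregi() was removed in PHP 7.0; stripos() gives the same
// case-insensitive substring match without a regular expression.
if( stripos($agent, "googlebot") !== false )
{
    mail($email, "Googlebot Alert",
        "Google just indexed your following page: " .
        $_SERVER['REQUEST_URI']);
}
if( stripos($agent, "YandexBot") !== false )
{
    mail($email, "Yandex Alert",
        "Yandex just indexed your following page: " .
        $_SERVER['REQUEST_URI']);
}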
?>
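The log search described earlier, looking for entries that contain the YandexBot user agent string, can also be scripted. The following is only a minimal sketch: the function name count_crawler_hits and the sample log lines are hypothetical and made up for illustration, not taken from this site's logs, and it assumes each log entry records the user agent string on the same line.

```php
<?php
// Count the log lines whose text contains the given crawler name.
// Hypothetical helper: the name and signature are illustrative only.
function count_crawler_hits(array $lines, $needle)
{
    $hits = 0;
    foreach ($lines as $line) {
        // stripos() is case-insensitive and returns false when there is no match.
        if (stripos($line, $needle) !== false) {
            $hits++;
        }
    }
    return $hits;
}

// Made-up sample entries, for illustration only.
$sample = array(
    '192.0.2.10 - - [01/Jan/2014:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"',
    '192.0.2.20 - - [01/Jan/2014:00:00:02 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.30 - - [01/Jan/2014:00:00:03 +0000] "GET /about HTTP/1.1" 200 256 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
);

echo "YandexBot entries: ", count_crawler_hits($sample, "YandexBot"), "\n";
```

To run it against a real log, the sample array could be replaced with the result of file() called on the log file with the FILE_IGNORE_NEW_LINES flag; the log's location depends on the web server configuration.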