https://searchenginewatch.com/2016/02/25/say-goodbye-to-google-14-alternative-search-engines/

Say goodbye to Google: 14 alternative search engines

SEO 25 Feb 16 | Christopher Ratcliff

Well, it’s been a big week for search, I think we can all agree. If you’re a regular Google user (65% of you globally) then you’ll have noticed some changes, both good and bad. I won’t debate the merits of these improvements, we’ve done that already here: Google kills Right Hand Side Ads and here: Google launches Accelerated Mobile Pages, but there’s a definite feeling of vexation that appears to be coming to a head.

Deep breath…

As the paid search space increases in ‘top-heaviness’, as organic results get pushed further off the first SERP, as the Knowledge Graph scrapes more and more publisher content and continues to make it pointless to click through to a website, and as our longstanding feelings of unfairness over Google’s monopoly and tax balance become more acute, now more than ever we feel there should be another, viable search engine alternative.

There was a point not that long ago when you could easily divide people between those that used Google, Yahoo, Ask Jeeves and AltaVista. Now it’s got to the point where if you’re not using Google, you’re not really using the internet properly.

Right now though maybe we should be paying more attention to the alternatives. Maybe our daily lives and, for some of us, careers shouldn’t need to balance on the fickle algorithm changes of the world’s most valuable company.

Let’s see what else is out there in the non-Google world. It’s not that scary, I promise. Although you may want to bring a coat.

Please note: this is an update of an article published on SEW in May 2014. We felt it needed sprucing up, especially as many of the listed engines (Blekko, Topsy) are no longer with us.

Bing

Microsoft’s search engine is the second most popular search engine in the world, with 15.8% of the search market. But why should you use Bing?
Lifehacker has some great articles where they try to convince themselves, as much as anyone else, why Bing is a serious contender to Google. Plus points include:

Bing’s video search is significantly better than Google’s, giving you a grid of large thumbnails that you can click on to play, or preview by hovering over them.
Bing often gives twice as many autocomplete suggestions as Google does.
Bing can predict when airfares are about to go up or down if you’re searching for flights.
Bing also has a feature where if you type linkfromdomain: followed by a site, it will highlight the best-ranked outgoing links from that site, helping you figure out which other sites your chosen site links to the most.

Also note that Bing powers Yahoo’s search engine.

DuckDuckGo

The key feature of DuckDuckGo is that it doesn’t retain its users’ data, so it won’t track you or manipulate results based on your behaviour. So if you’re particularly spooked by Google’s all-seeing, all-knowing eye, this might be the one for you. There’s lots more info on DuckDuckGo’s performance here.

Quora

Even as Google gets better and better at answering more complicated questions, it will never be able to match the personal touch available with Quora. Ask any question and its erudite community will offer their replies. Or you can choose from any similar queries previously asked.

Dogpile

Dogpile may look like a search engine you cobbled together with clip-art, but that’s rather the point: it pulls in and ‘curates’ results from various different engines including Google, Yandex and Yahoo, but removes all the ads.

Vimeo

Of course if you’re going to give up Google, then you’ll also have to give up YouTube, which can be a terrifying prospect. But there is an alternative. And a pretty good one at that… Vimeo. The professional’s choice of video-sharing site, which has lots of HD video and no ads.

Yandex

This is a Russian portal, offering many of the same products and services as Google, and it’s the dominant search engine in Russia.
As you can see, it offers results in a nice logical format, replete with favicons so you can clearly see the various channels for your branded queries.

Boardreader

If you want to get into the nitty-gritty of a subject with a variety of different points of view away from the major publications, Boardreader surfaces results purely from forums, message boards and, of course, Reddit.

WolframAlpha

WolframAlpha is a ‘computational knowledge engine’, or super clever nerd to you and me. Ask it to calculate any data or ask it about any fact and it will give you the answer. Plus it does this awesome ‘computing’ thing while it thinks about your answer (which can take a short while). It’s not always successful, and you have to practise to get the best from it. But at least it’s aware of the terrible 90s television show The Dinosaurs.

IxQuick

Another search engine that puts its users’ privacy at the forefront. With IxQuick none of your details are stored and no cookies are used. A user can set preferences, but they will be deleted after 90 days of inactivity.

Ask.com

Oh look… Ask Jeeves is still around. Also he’s no longer a Wodehousian butler, but a computer-generated bank manager. Weird. It’s still a slightly mediocre search engine pretending to be a question and answer site, but the ‘Popular QA’ results found on the right-hand side are very handy if Jeeves himself can’t satisfy your query. And what a good use of the right-hand side space, huh Google.

SlideShare

SlideShare is a really handy place to source information from presentations, slide decks, webinars and whatever else you may have missed from not attending a conference.
You’ll also be surprised what information you can find there.

Addict-o-matic

“Inhale the web” with the friendly looking hoover guy by creating your own topic page, which you can bookmark to see results from a huge number of channels on that one page (including Google, Bing News, Twitter, YouTube, Flickr).

Creative Commons Search

CC Search is particularly handy if you need to find copyright-free images for your website (as discussed in this post on image optimisation for SEO). Just type your query in, then click on the site you want to search.

Giphy

Because really, when it comes down to it, we could imagine a worse dystopian future than one in which we all communicate entirely in Gifs.

Christopher Ratcliff is the editor of Search Engine Watch. You can follow him on Twitter: @Christophe_Rock
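Stata commands I use most often

The two most important commands are surely help and search. Even people who use Stata regularly can hardly remember every detail of the common commands, and there is no need to, let alone the rarely used ones. So when you are stuck and there is no free expert to consult, Stata's built-in help files are the best option. The help files are extremely detailed and cover everything, which is both a blessing and a nuisance: faced with a very long help file, do you feel confident that you can quickly find what you need?

Enough chatter. Both help and search look up help files. The difference is that help looks up an exact command name, whereas search does a fuzzy, keyword-based lookup. If you know a command's name and want to know exactly how to use it, type help, a space, and the name in Stata's Command window. After you press Enter, the Results window shows the full help file for that command. If you want to run some estimation or calculation in Stata but do not know how to go about it, use search. It works like help, except that you replace the exact command name with a keyword. After you press Enter, the Results window lists the names of, and links to, all help files related to the keyword. Look for the most relevant entry in the list; clicking it opens the help file in the Viewer window. With patient searching and repeated experimenting you can usually find what you need fairly quickly.

Now to the actual data work. My experience is that it is best to record everything you do in Stata's do-file editor. Hardly any empirical study is finished in one sitting, so being able to repeat your earlier work when you come back to it is very important. Sometimes, because of some small difference, you find that you cannot reproduce your earlier results. A do-file that recorded your earlier work will then take you from hell to heaven, because you no longer have to try again and again to recreate what you did. In the toolbar at the top of the Stata window there is a lone small button; hovering over it shows "bring do-file editor to front", and clicking it opens the do-file editor.

For a do-file to run smoothly you generally need to write a "header" and a "footer" for it. Here are the ones I use.

    /* (Label. A brief note on the purpose of this file.) */
    capture clear                        // clear the data in memory
    capture log close                    // close any open log file
    set mem 128m                         // set the amount of memory available to Stata
    set more off                         // turn off -more-. When it is on, output is shown one screen at a time and you
                                         // press the space bar for the next screen; when it is off, output scrolls without pausing
    set matsize 4000                     // set the maximum matrix size (is my value too large?)
    cd D:                                // change to the drive and folder holding the data, much like the DOS command line
    log using (filename).log, replace    // open the log file, overwriting it. The log records all output of the run;
                                         // if you change the do-file, replace updates the log to the latest run
    use (filename), clear                // open the data file

    (body of the do-file)

    log close                            // close the log file
    exit, clear                          // exit and clear the data in memory

This do-file header and footer are not my invention; I learned them from my teacher Shen Minggao, and the credit belongs to him. (To be continued)

Stata commands I use most often (continued)

Empirical work usually starts from raw data. Raw data have not been cleaned and contain errors, omissions and inconsistencies. For example, a missing observation on some variable may be coded as a dot in some places and as -9, -99 and so on in others. If such observations enter a regression, the results are often badly wrong. In addition, the same variable sometimes has different names in different data files, which causes trouble when the files are merged. So after getting the raw data you usually need to build a new dataset suited to your needs and then work only with that new dataset. This part of the job is not hard, but it is fundamental: if you are careless here, everything you do afterwards may be wasted.

Suppose you know exactly which variables you need. The task now is to inspect the data, generate the necessary variables, and build a dataset for later use. The key commands for inspecting data are codebook, su, ta, des and list. codebook gives the most complete information; its drawback is that it cannot be restricted with an if condition, so sometimes the other commands have to help out (current Stata releases do accept if and in with codebook). su followed by a variable name reports the number of nonmissing observations, the mean, the standard deviation, the minimum and the maximum of that variable. ta followed by one variable name reports the frequency, percentage and cumulative percentage (in ascending order of value) of each nonmissing value; followed by two variable names it reports their two-way table. des can be followed by any number of variable names, as long as they exist in the data. It reports each variable's storage type, display format and label. The label usually records the variable's definition and unit. list displays the values of the observations and can be restricted with if or in. All of these commands can also be run without any variable name, in which case they report on every variable in the dataset in use. Words only go so far; open Stata and try them yourself.

A digression. Apart from codebook, the descriptive commands above are all r-class commands (also called general commands). After running one, you can use return list to display the results stored in r(). The most typical r-class command is summarize. It stores the sample size, mean, standard deviation, variance, minimum, maximum, sum and other statistics. After running su, just type return list to see all of them. Just as general commands have return, estimation commands (also called e-class commands) have ereturn, which reports and stores results in the same way. In more complex programming, such as decomposing a regression or computing statistics that the built-in routines do not provide directly, these facilities are indispensable.

When checking the data, first use codebook to look at each variable's range and unit. If values such as -9 or -99 appear, check how the questionnaire codes missing values. Once you are sure they are missing values, recode them as dots. The command is

    replace (variable name)=. if (variable name)==-9

Then check how many dot-coded missing values there are, as one basis for deciding whether to use the variable.

Once the data are usable, I add labels to variables that have none, or standardize the labels, or standardize the variable-naming rules. The command for renaming a variable is ren (old name) (new name). The command for defining a label is label var (variable name) "(label text)". Uniform variable names are easier to remember, and concise labels make the unit and other information about a variable clear.

If you need new variables derived from the original ones, you need to know three commands: gen, egen and replace. gen and replace are often used together. Their basic syntax is gen (or replace) (variable name)=(expression). The difference is that gen creates a new variable, whereas replace redefines an existing one.

Dummy variables are a kind of derived variable we need all the time. If you need only a few, there are two ways to create them. The first is the short way:

    gen (variable name)=((condition))

If an observation satisfies the condition, its dummy equals 1; otherwise it equals 0. The second way is slightly more work:

    gen (variable name)=1 if (condition for the value 1)
    replace (same variable name)=0 if (condition for the value 0)

The two methods look alike, but there is one small difference. If none of the variables used in the condition has missing values, the two methods give the same result. If there are missing values, the first method still assigns a definite value to the observations where they occur: 0 for a condition such as x==1, and, because Stata treats a missing value as larger than any number, even 1 for a condition such as x>5. The second method lets the dummy take three values: 1, 0, or missing. This avoids wrongly including observations whose information is unknown in the regression. Next time I will explain how to create hundreds or thousands of dummy variables conveniently.

Stata commands I use most often (continued)

Large sets of dummy variables are usually generated from the values of some existing variable. For example, in a regression you may want to control for the community each observation belongs to, that is, to include dummies that mark the community. There may be hundreds or thousands of communities, and creating the dummies with the method from last time would mean repeating it hundreds or thousands of times, which is far too clumsy. The command for generating dummies in bulk is:

    ta (variable name), gen( (variable name) )

The variable name in the first parentheses is the existing variable, the community code in the example above. The variable name in the second parentheses is the common prefix of the new dummies, which is followed by a number to distinguish them. If I enter d here, the command creates d1, d2, and so on, until every community has its own dummy.

To control for community in a regression, simply include these variables. One nuisance is that there are so many of them; how can they be added concisely? One way is the wildcard: d* stands for all variables whose names start with the letter d. Another is the hyphen: d1-d150 stands for the first through the 150th community dummy (assuming there are 150 communities).

There is also a way to control for the dummies directly in the regression without actually creating them. The areg command does this. Its syntax is

    areg (dependent variable) (explanatory variables), absorb( (variable name) )

The variable name in the absorb option is the same as the first variable name in the previous command, that is, the community code in the example above. The regression results are the same as those from including the corresponding dummies directly in reg.

The last tool for generating variables is egen. Both egen and gen create new variables, but egen stands out for its more powerful functions. gen supports some functions, and egen supports additional ones. If gen cannot do the job, look to egen. I am rather lazy, though, and so far I have only used simple functions such as means and sums.

Sometimes the data are more complicated, generating the variable you need is not straightforward, and a few extra steps are required. I once ran into raw data in which dates were recorded in an odd format. For example, 23 October 1991 was recorded as 19911023. I wanted to use its year and month and generate dummies from them. This is what I did:

    gen yr=int(date/10000)
    gen mo=int((date-yr*10000)/100)
    ta yr, gen(yd)
    ta mo, gen(md)

Suppose you have now generated all the variables you need. The most important thing is to save your work. The command is save (filename), replace. As explained earlier, the replace option overwrites the file with your changes to the dataset, so use it with great care. It is best to save to a new dataset; if you alter the original one and cannot change it back, nobody will be able to help you.

Stata commands I use most often (continued)

Everything so far has been simple operations on a single dataset. Sometimes, however, we need to change the structure of the data or pull together information from different datasets, and that calls for more convenient commands. Of this family, the ones I have used are reshape, which switches the data between long and wide layouts; collapse, which produces a reduced dataset of summary statistics; and append and merge, which combine datasets.

Longitudinal data usually contain observations on the same agent in different periods, so handling them often requires converting the dataset from wide form to long form, or the reverse. In wide form each agent is one observation, and the variables for the different periods are all recorded in that observation. For example, let the agents be firms, the periods 2000 and 2001, and the variables the number of employees and the city where the firm is located; assume employment differs across periods while the city does not change. In wide form each firm is one observation, there is no period variable, employment is held in two variables recording the 2000 and 2001 figures respectively, and the city is a single variable. In long form the agent and the period jointly define an observation. In the example above, each firm has two observations, there is a period variable, and employment and city are each a single variable whose values, together with the period variable, give the figures for the corresponding period.

In this example the command for converting wide form to long form is:

    reshape long (employment variable name), i( (firm identifier variable) ) j( (period variable) )

Because the city does not vary across periods, it does not need to be listed after reshape long, and it is unchanged by the conversion. Conversely, to convert long form to wide form use

    reshape wide (employment variable name), i( (firm identifier variable) ) j( (period variable) )

The only difference is that long becomes wide.

collapse computes some statistics of a dataset and stores them as a new dataset containing only those statistics. There are not many occasions to use it; I use it because it can compute the median and the 1st through 99th percentiles, which the routine descriptive commands do not provide. To compute a median, the syntax is

    collapse (median) ( (variable name) ), by( (variable name) )

The new dataset holds the median of the variable in the first parentheses (several variables may be given). The by option on the right computes the median within groups defined by some variable; without it, the median of the whole sample is computed.

There are two ways of combining datasets: adding observations and adding variables. The first uses append. It applies when two datasets have the same layout but different observations; append using (filename) simply tacks one onto the end of the other. It is simple and clear, and little can go wrong. The other way is different and calls for extra care. If two datasets contain the same observations but different variables, and you want to bring some variables from one dataset into the other, use merge. The complete sequence of commands is:

    use (filename)
    sort (variable name)
    save (filename), replace
    use (filename)
    sort (variable name)
    merge (variable name) using (filename), keep( (variable name) )
    ta _merge
    drop if _merge==2
    drop _merge
    save (filename), replace

Stata commands I use most often (continued)

At this point it might seem that I should stop talking about generating and processing data. Readers would probably rather hear about estimation and testing. But I do not want to stop here, because in practice there are always special requirements that cannot be met by simply applying a command. In such cases at least two roads lead to Rome: find a more advanced command that does it in one step, or take a longer route through simple commands you already know.

Here is an experience I will never forget, and the most laborious construction of a new dataset I have met so far. The raw data contained information that identifies all individuals belonging to the same household, together with each member's relationship to the household head. The goal was to use this information to establish parent-child links. The initial idea was that the new dataset would have children as observations, find their parents, and attach the parents' variables to each observation. This is what I did:

    use a1,clear
    keep if gender==2agemos=96a8~=1line10
    replace a5=1 if a5==0
    keep if a5==1|a5==3|a5==7
    ren h hf
    ren line lf
    sort wave hhid
    save b1,replace
    keep if a5f==1
    save b2,replace
    use b1,clear
    keep if a5f==3|a5f==7
    save b3,replace
    use a3,clear
    sort wave hhid
    merge wave hhid using CHNS01b2, keep(hf lf)
    ta _merge
    drop if _merge==2
    sort hhid line wave
    by hhid line wave: egen x=count(id)
    drop x _merge
    save b4,replace
    use a4,clear
    sort wave hhid
    merge wave hhid using CHNS01b3, keep(a5f a8f schf a12f hf agemosf c8f lf)
    ta _merge
    drop if _merge==2
    sort hhid line wave
    by hhid line wave: egen x=count(id)
    gen a=agemosf-agemos
    drop if a216x==3
    gen xx=x
    gen xxx=x
    gen y=lf if x==1
    replace y=lf if x==2 & xx==1
    replace y=lf if x==2 & xxx==1
    keep if x==1|(lf==y & x==2)
    drop a x xx xxx y _merge
    save b5,replace
    log close
    exit,clear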
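The date step earlier in these notes (turning 19911023 into a year and a month, then building one dummy per value) is easy to get wrong by a factor of 10000, so it may help to check the arithmetic outside Stata. The following is only a minimal Python sketch of that arithmetic, not Stata code; the names split_yyyymmdd and tab_gen are hypothetical, and tab_gen only imitates the idea of ta ..., gen() for data without missing values.

```python
def split_yyyymmdd(date):
    """Split an integer date such as 19911023 into (year, month, day)."""
    yr = date // 10000                 # 19911023 -> 1991
    mo = (date - yr * 10000) // 100    # 1023 -> 10
    day = date % 100                   # last two digits -> 23
    return yr, mo, day


def tab_gen(values, prefix):
    """Build one 0/1 indicator list per distinct value, numbered in ascending
    order of the values (hypothetical helper; assumes no missing values)."""
    levels = sorted(set(values))
    return {
        f"{prefix}{k}": [int(v == level) for v in values]
        for k, level in enumerate(levels, start=1)
    }


dates = [19911023, 19911105, 19920117]
years = [split_yyyymmdd(d)[0] for d in dates]
months = [split_yyyymmdd(d)[1] for d in dates]
year_dummies = tab_gen(years, "yd")
print(years, months, year_dummies)
```

With these three dates the years are 1991, 1991 and 1992, so two year dummies result, one per distinct year.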
Malaysian Remote Sensing Agency (MRSA) received new satellite images from France that were taken on March 23. The images showed 122 potential objects in one area of the ocean. Some of the objects were as much as 23 meters in length. Some appeared bright, possibly indicating solid material. They were located about 2,500 kilometers from Perth. "This is another new lead that will help direct the search operation," said Acting Minister of Transportation Hishammuddin Bin Hussein on Wednesday.

Flight 370 search resumes; families remain in limbo

(CNN) -- Cheng Li Ping is afraid to tell her sons their father might never come home. "My heart can't handle it. I don't want to hurt my children," the Chinese woman told CNN Wednesday as she waited in Kuala Lumpur for evidence about what happened to her husband and the 238 others who were aboard Malaysia Airlines Flight 370. Cheng says she cannot bring herself to accept that her husband is dead, even after authorities announced there were no survivors. "I can't trust the Malaysian government. I can't work now because all I can think about is my husband and my children," she told CNN's Sara Sidner in Kuala Lumpur. "I don't have strength. ... My head is a mess."
On Sunday, eight airplanes will fly over the southern Indian Ocean searching for missing Malaysia Airlines Flight 370, said Australian Maritime Safety Authority spokeswoman Andrea Hayward-Maher. That's two planes more than Saturday and the most aircraft involved in the search led by Australia so far, she said. Sunday's search will be a visual search, AMSA rescue spokesman Mike Barton told reporters. Eyes will take precedence over radar. The planes will base their movements on Chinese satellite images of debris and drift modeling, the AMSA said.
(CNN) -- Satellite imagery that may show debris from Malaysia Airlines Flight 370 is raising hopes that investigators can narrow what has been a needle-in-a-haystack search operation. The images were obtained and analyzed by the Australian Geospatial-Intelligence Organisation and offer "a possible indication of debris south of the search area that has been the focus of the search operation," according to the Australian Maritime Safety Authority. They were taken above a remote part of ocean thousands of kilometers south-east of Australia. Two objects, one approximately 24 meters (78.7 ft) in length and another around five meters (16.4 ft) long, have been spotted, raising hopes that more information regarding the missing airliner has come to light.
General form, INDEX function:

    INDEX(source,excerpt)

where
    source specifies the character variable or expression to search
    excerpt specifies a character string that is enclosed in quotation marks (' ').

INDEX can be used to select the subset of a data set whose values contain a given string:

    data hrd.datapool;
       set hrd.temp;
       if index(job,'word processing') > 0;
    run;
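To make the return value concrete: INDEX gives the 1-based position of the first occurrence of excerpt in source, and 0 when it does not occur, which is why the subsetting IF compares the result with 0. The following is a rough Python sketch of that behaviour, not SAS code; the function name index and the sample job titles are made up for illustration, and the comparison is case-sensitive, as in SAS.

```python
def index(source, excerpt):
    """Return the 1-based position of the first occurrence of excerpt in
    source, or 0 if it is absent (a stand-in for the SAS INDEX function)."""
    return source.find(excerpt) + 1    # str.find returns -1 when absent


# Hypothetical job titles standing in for the job variable of hrd.temp.
jobs = ["word processing clerk", "filing", "senior word processing", "Word Processing"]

# Keep only the rows whose job contains the excerpt, like the subsetting IF.
datapool = [job for job in jobs if index(job, "word processing") > 0]
print(datapool)
```

The capitalised "Word Processing" row is dropped because the match is case-sensitive.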