Interesting People mailing list archives
Re WSJ ends Google users' free ride, then falls 44% in search results
From: "Dave Farber" <farber () gmail com>
Date: Wed, 14 Jun 2017 21:41:15 -0400
Begin forwarded message:
From: Chuck McManis <chuck.mcmanis () gmail com>
Date: June 14, 2017 at 8:21:27 PM EDT
To: Dave Farber <dave () farber net>
Subject: Re: [IP] Re WSJ ends Google users' free ride, then falls 44% in search results

[For IP if you wish]

Having worked at Google (although not directly in the Search group), deployed a competitive search engine (Blekko), and dealt directly with the issues of crawling and search, I think a number of things are important in this discussion.

First (and perhaps foremost), there is a big question of business models, costs, and value. That ground is fairly well covered, but to summarize: web advertising generates significantly (as in orders of magnitude) less revenue than print advertising. Subscription models have always had 'leakage' when content was shared as a print copy was handed around (or lent, in the case of libraries); content production costs (the costs apart from printing and distributing physical copies) have gone up; and the value of information (as a function of its availability) has gone down. This is a fascinating (for me) economic system and the subject of many learned papers, but its impact on this discussion is that publications like the Wall Street Journal are working hard to maximize the value they can extract within the constraints of the web infrastructure.

Second, there is a persistent tension between people who apply classical economics to the system and those who would like to produce a financially viable work product.

And finally, there is a "fraud surface area" component: the new infrastructure enables exploits that are relatively easy to carry out without a concomitant level of risk to the perpetrators.

So let's approach this from the fraud perspective first and directly answer the complaint, "The WSJ must have the world's worst web programmers if they can't figure out how to show Google the full articles even though normal users are paywalled."
Google is a target for fraudsters because subverting its algorithm can enable advertising click fraud, remote system compromise, and identity theft. One scheme that arose early in Google's history was sites that presented something interesting when the Google crawler came through reading the page, but something malicious when an individual came through. The choice of what to show in response to an HTTP request was determined largely from metadata associated with the connection, such as the User-Agent, source address, protocol options, and optional headers.

To combat this, Google developed a crawling infrastructure that will crawl a web page and then, at a future date, audit that page by fetching it from an address and with metadata that suggest a human viewer. When the contents of a page change based on whether or not the connection looks human, Google typically dumps the page immediately and penalizes the domain's PageRank (which pushes the page into the later pages of results, where it is less likely to be clicked on by the general public).

Google is also a company that doesn't generally like to put in "exemptions" for a particular domain. It has had issues in the past where an exemption was added, the company behind the domain went out of business, and the domain was acquired by a bad actor who then exploited the exemption to expose users to malware-laced web pages. As a result (at least as of 2010, when I left), the policy was not to provide exceptions and not to create future problems for the time when the circumstances around a specific exemption might no longer apply. Consequently, supporting anything out of the ordinary requires significant coordination between the web site and Google, and that costs resources which Google is not willing to donate to solve the web site's problems.
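The detection step described above can be sketched in a few lines. This is a hypothetical illustration, not Google's actual code: a dishonest server varies its response by User-Agent, and an audit fetches the same page twice, once with crawler metadata and once with browser-like metadata, flagging any difference.

```python
# Sketch (hypothetical, not Google's real infrastructure): how "cloaking"
# works and how an audit crawl detects it by comparing two fetches.

def cloaked_page(headers):
    """A dishonest server: wholesome text for the crawler, spam for humans."""
    user_agent = headers.get("User-Agent", "")
    if "Googlebot" in user_agent:
        return "An informative article about gardening."
    return "CLICK HERE FOR FREE PRIZES!!!"

def audit(fetch):
    """Fetch once as the crawler, then once with metadata that suggests a
    human viewer; any difference is evidence of cloaking and grounds for
    a PageRank penalty."""
    as_crawler = fetch({"User-Agent": "Googlebot/2.1"})
    as_human = fetch({"User-Agent": "Mozilla/5.0 (Windows NT 10.0)"})
    return as_crawler != as_human  # True -> page is cloaked

print(audit(cloaked_page))  # the cloaked site fails the audit -> True
```

An honest page that returns the same content for every request would pass the same audit, which is why the check punishes only sites whose responses depend on who appears to be asking.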
It is also important to note that both Google and the WSJ are cognizant of the sales-conversion opportunity created when a reader *knows*, because of the snippet, that some piece of information is present in a document, and is then denied free access to it. It connects the dots between "there is something here I want to know" and "you can pay me now and I'll give it to you." As a result, if Google were to continue to rank WSJ articles on the first page of results, it would be providing a financial boost to the WSJ while not benefiting financially itself at all. The bottom line is, as it usually is, that there is value here and the market maker is unwilling to cede all of it to the seller.

Google has solved this problem with web shopping sites by telling them they have to pay Google a fee to appear on the first page of results; no doubt if the WSJ were willing to pay Google an ongoing maintenance fee, Google would put WSJ pages back on the first page of results (even without their being readable when clicked). As has been demonstrated in the many interactions between Google and the newspapers of the world, absent any externally applied regulation, there are three 'values' Google is willing to accept. You can give Google's users free access to a page found through Google (the "one click free" rule), which Google values because it keeps Google everyone's first choice when searching for information. Alternatively, you can allow only Google advertising on your pages, which Google values because it can extract some revenue from the traffic it sends your way. Or you can simply pay Google for the opportunity to be in the set of results the user sees first.
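The "one click free" arrangement amounts to a small branch in the publisher's serving logic. The sketch below is hypothetical (the names and the Referer-based check are my own illustration, not the WSJ's actual code): the paywall is waived for subscribers and for visitors arriving from a Google results page, detected via the Referer header.

```python
# Hypothetical sketch of a "one click free" paywall: waive the wall for
# subscribers and for readers who clicked through from a Google search.

FULL_TEXT = "The complete article text."
TEASER = "First few paragraphs... [Subscribe to read more]"

def serve_article(headers, is_subscriber=False):
    referer = headers.get("Referer", "")
    came_from_google = referer.startswith("https://www.google.")
    if is_subscriber or came_from_google:
        return FULL_TEXT   # the "one click free" path Google rewards
    return TEASER          # everyone else sees the snippet plus a paywall
```

When the WSJ dropped this rule and began returning only the teaser to everyone, the crawler saw just the first few paragraphs, with the ranking consequences described in the article under discussion.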
--Chuck McManis

On Wed, Jun 14, 2017 at 4:09 PM, Dave Farber <farber () gmail com> wrote:

Begin forwarded message:
From: "John Levine" <johnl () iecc com>
Date: June 14, 2017 at 6:37:48 PM EDT
To: dave () farber net
Cc: "Lauren Weinstein" <lauren () vortex com>, "Bcc" <johnl-sent () iecc com>
Subject: Re: [IP] WSJ ends Google users' free ride, then falls 44% in search results

In article <6D9A5574-7651-4048-B295-66085444E8F5 () gmail com> you write:

After the Journal's free articles went behind a paywall, Google's bot only saw the first few paragraphs and started ranking them lower, limiting the Journal's viewership. Executives at the Journal, owned by Rupert Murdoch's News Corp., argue that Google's policy is unfairly punishing them for trying to attract more digital subscribers. They want Google to treat their articles equally in search rankings, despite being behind a paywall.

The WSJ must have the world's worst web programmers if they can't figure out how to show Google the full articles even though normal users are paywalled. That's what all the other paywalled papers do. Sheesh.

R's,
John

PS: If the argument were "but then people can get them from the Google cache" their programmers would be even worse than I thought.