Wednesday, July 17, 2013

Braverman Take Two - When Sec. 230 Meets Data Mash-Up

At first, I thought Braverman v. YELP, INC., 2013 NY Slip Op 31407 - NY: Supreme Court 2013 was another roll-your-eyes, when-are-local-attorneys-going-to-learn-about-Sec.-230 case.  After all, it fits the stereotypical mold: a third party patronizes a merchant; the third party writes a negative review on a website; the merchant sues the website; the website boots the lawsuit based on Sec. 230.  The program for this case is not new; you've seen this drama play out over and over.

But the more I pondered this court decision, the more it elucidated a hidden conundrum.  The court in Roommates.com established that there is some mystic line that websites cross, moving from hosts of third-party content to creators of content.  In Roommates.com, the website asked potentially discriminatory questions and required users to answer the questions as a condition of setting up profiles. In requiring users to create inappropriate content, the website's actions, the court concluded, rose above permissible minor edits or selecting which third-party content to publish. The website had become a producer of the questionable content and therefore was not immune from liability under Sec. 230.

Braverman involves Yelp, a dentist, and a negative review.  The dentist didn't like the review and therefore sued Yelp. Yelp whipped out its copy of United States Code, Title 47, and checked to see if Sec. 230 was still in it. It was. Therefore, Yelp filed that same motion that has been filed 1000s of times before by review sites: motion to dismiss for failure to state a cause of action, Rule 12(b)(6); a review site is not liable for third-party content. Third-party reviews are the responsibility of the author of the review and not the website. The Court agreed.

"But wait a minute," argued Plaintiff's attorney.  Yelp is not just some neutral actor here.  If you frequent Yelp, you might know that Yelp has filters with which Yelp selects some reviews to show and other reviews to hide (you can see them if you solve a CAPTCHA). This has led to consternation and controversy. Yelp's selection of which reviews to show can significantly alter the appearance of the merchant in question - and because of this, Yelp, Plaintiff's attorney argued, became a publisher of the negative review in question.

Nope, says the court.  Selecting which third-party content to publish is protected by Sec. 230: 
Yelp's alleged act of filtering out positive reviews does not make Yelp the creator or developer of the alleged defamatory reviews. Yelp's choice to publish certain reviews — whether positive or negative — is an exercise of a publisher's traditional editorial function protected by the CDA. Batzel v. Smith, 333 F.3d 1018, 1030 (9th Cir. 2003) (finding that it is an editorial function to "choose among proferred material"); Barnes v. Yahoo!, Inc., 570 F.3d 1096 (9th Cir. 2009) (finding that it is an editorial function to decide "whether to publish or to withdraw from publication third-party content"). Moreover, Section 230 does not distinguish between neutral and selective publishers in its grant of immunity. Shiamili, 17 N.Y.3d at 289.
Okay, but, wait.  I mean, come on.  Hear me out on this one.

Roommates.com established that Sec. 230 is not absolute - that there is a point where the line between the third party and the website becomes blurred, and the website can no longer claim to have never met the third party on the street. Yelp is playing a numbers game. By manipulating the numbers, Yelp changes review outcomes - controlling the flow of consumer business as consumers follow the top reviews. Given most sets of third-party reviews, Yelp can select certain reviews and make a merchant look marvelous or miserable. It isn’t the third parties creating that aggregated data representation of the merchant - it's Yelp through Yelp's manipulation of the data.

Let's go to the videotape.  At the time of this blog post (not at the time of the court case), the dentist in question had 30 reviews.  Three were visible; 27 were hidden.  The reviewers could give one to five stars. The visible average rating was "1." The average rating of all the reviews (hidden and unhidden) was "4.3"; the median rating of all reviews was "5." That is a long way from the visible "1."  There were five ratings of "1," one rating of "4," and twenty-four ratings of "5."  Charted out, we get something that looks like this:



Does this dentist look like a "1"? Again, if we display the data as a donut chart, we get the following:

Why plot it out as a donut chart?  I like donuts!

Eighty percent of the reviews were "5"; 83% were "4" or above.  Only 17% of the reviews were a "1." And yet, by showing certain reviews and hiding others, Yelp turns that record into a visible rating of "1."
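For anyone who wants to check my arithmetic, here is a minimal sketch in Python that recomputes the figures above from the star counts. The one assumption is the makeup of the visible set: a visible average of "1" across three reviews can only mean three one-star reviews, so that is what the sketch uses.

```python
from statistics import mean, median

# Star counts as reported in this post: five 1s, one 4, twenty-four 5s.
counts = {1: 5, 4: 1, 5: 24}
ratings = [stars for stars, n in counts.items() for _ in range(n)]

total = len(ratings)                      # 30 reviews in all
overall_mean = mean(ratings)              # average over hidden and visible reviews
overall_median = median(ratings)

share_five = counts[5] / total
share_four_plus = sum(n for stars, n in counts.items() if stars >= 4) / total
share_one = counts[1] / total

# Assumption: three visible reviews averaging 1 must all be 1-star reviews.
visible = [1, 1, 1]
visible_mean = mean(visible)

print(f"all reviews: mean {overall_mean:.1f}, median {overall_median:.0f}")
print(f"5 stars: {share_five:.0%}, 4+ stars: {share_four_plus:.0%}, 1 star: {share_one:.0%}")
print(f"visible reviews: mean {visible_mean:.1f}")
```

The gap between the last line and the first is the whole argument: the same thirty reviews, two very different pictures.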

Yelp isn’t presenting third-party content here; the third-party content - the data - says this dentist is pretty good. Yelp is manipulating data to produce new content. This isn’t content that the third parties provided; this is content that Yelp created by its interaction with data.  And through this interaction, Yelp can turn any highly-rated merchant into a stink-pot, and any stinky merchant into king-of-the-hill.
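To make the point concrete, here is a rough sketch of how far a displayed average can swing depending only on which three reviews are shown. This is emphatically not Yelp's filter, whose logic I do not know; `displayed_average` and `favor_low` are hypothetical names I made up for illustration, and the ratings are the dentist's thirty from above.

```python
from statistics import mean

# The dentist's thirty ratings from this post: five 1s, one 4, twenty-four 5s.
ratings = [1] * 5 + [4] + [5] * 24

def displayed_average(all_ratings, visible_count, favor_low):
    """Average of the reviews a hypothetical filter chooses to show.

    favor_low=True shows the lowest-rated reviews; False shows the highest.
    A real filter would use other signals; this only bounds the possible swing.
    """
    ordered = sorted(all_ratings, reverse=not favor_low)
    return mean(ordered[:visible_count])

worst = displayed_average(ratings, 3, favor_low=True)    # three lowest ratings shown
best = displayed_average(ratings, 3, favor_low=False)    # three highest ratings shown

print("worst-case displayed average:", worst)
print("best-case displayed average:", best)
print("average of all reviews:", round(mean(ratings), 1))
```

Nothing about the underlying reviews changes between the two calls. Only the selection does.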

And it can do so with impunity, according to this court.  This raises the question of whether we have here crossed that Roommates line between content host and content creator.

Convinced?  Okay, what if I were to tell you that the data itself was garbage?  As you may know, there are companies out there that are now in the business of "reputation management."  For a price, they will go out and ensure that the third-party reviews of the merchant are the reviews that the merchant paid for. They will take the necessary steps to ensure that a stink-pot merchant looks like king-of-the-hill.

Companies like Yelp struggle to preserve the value of their review sites by weeding out bogus reviews. They weed out duplicates, sandbaggers, and content that lacks credibility.  They are constantly battling "reputation management" services that have opposing agendas.

In the case in question, it appears, for whatever reason (and it may be entirely legitimate), that eight of the reviews are duplicates.  You can see at one point one of the reviewers states "I don't know why my reviews never show up…" and proceeds to write a duplicate five-star review. 

Was this a frustrated reviewer who kept getting his reviews hidden by Yelp's tactics, or was this Yelp struggling to purge the reviews of ratings that were less than reliable?  Who knows?  But what is true is that review sites that want their reviews to be valuable struggle with invalid reviews and merchants attempting to manipulate the system. And in fact, Sec. 230 was designed for exactly this purpose.  It was designed to protect online hosts of third-party content who take some "editorial"-like actions in order to protect the quality of the content on their service and protect users from fraud and other malicious gobbledygook. Congress did not want services that attempted to do good by policing their content to be transformed into publishers, liable for third-party content.

So, um, where do we come out on this?

Simple answer: case dismissed; Yelp is not liable for third-party reviews.

Complex answer: To understand data and big data, we have to recognize the difference between individual third-party content and how that content is handled in the aggregate. Manipulation of data produces results that may not be true to or consistent with the underlying third-party data. Manipulation of data produces new content created by the manipulator, not by the third parties. Who is responsible for the new content created through the manipulation of data?

"If this sentence is quoted from a third party, is the sentence mine or the third party's?" 

"If" "every" "word" "in" "this" "sentence" "is" "quoted" "from" "a" "third" "party," "is" "the" "sentence" "mine" "or" "the" "third" "party's?"

"I""f" "e""v""e""r""y"" "l""e""t""t""e""r""… oh you get the idea.

There is a point where we cross the magical Roommates.com line, where third-party content, aggregated, manipulated, and machinated by me, becomes new content.