Interesting People mailing list archives

Fwd: How Do You Vote? 50 Million Google Images Give a Clue


From: "Dave Farber" <dave () farber net>
Date: Tue, 02 Jan 2018 21:25:00 +0000

---------- Forwarded message ---------
From: Gordon Peterson <gep2 () terabites com>
Date: Tue, Jan 2, 2018 at 4:19 PM
Subject: Re: [IP] How Do You Vote? 50 Million Google Images Give a Clue
To: <dave () farber net>


[for IP if you like]

Interesting stuff, but don't overlook the simpler solutions.

For example, here in Texas (at least) a lot of this demographic data is a
matter of public record (available either free or relatively
inexpensively).

Among the information available are:

1)  Election results data reports (precinct-by-precinct totals, not
identifying individual voter choices!), including straight-party voting,
individual candidate voting, early voting, mail-in ballots, etc.  It's
interesting, for example, how voting drops off as voters get to the less
familiar names further down the ballot.  You can also identify how active
individual precincts' voters are, what the party voting is likely to be by
precinct, and so forth.

2)  Property tax appraisal databases (for individual addresses: who owns
the property, what its taxable value is, how it's appraised,
commercial/residential/retail, etc.), so you can identify owners versus
renters, etc.

3)  Voter registration databases (individual voters, active/suspense status,
their age, where they live (address/precinct), how long ago they registered
to vote, which elections they've voted in
(primaries (and which party's!)/general/municipal/runoffs/etc.), whether
they voted early/by mail/in person, how politically active they are...).
And if you've kept archives of these old databases, you can compare them to
identify name changes, moves, who lives with whom, marriages and
separations, and more.  You can identify things like probable kids living
with their parents, married couples living with a parent/mother-in-law, etc.
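
[A minimal sketch of the archive comparison described in item 3, written
in Python rather than SPITBOL; it is not the author's actual program. The
record layout (voter_id, name, address) and the sample rows are
hypothetical, and it assumes each voter keeps a stable registration number
across snapshots.]

```python
from typing import Dict, List, NamedTuple


class Voter(NamedTuple):
    voter_id: str  # hypothetical stable registration number
    name: str
    address: str


def compare_snapshots(old: List[Voter], new: List[Voter]) -> Dict[str, List[str]]:
    """Report which voter IDs moved, changed name, appeared, or disappeared."""
    old_by_id = {v.voter_id: v for v in old}
    new_by_id = {v.voter_id: v for v in new}
    changes: Dict[str, List[str]] = {
        "moved": [], "renamed": [], "added": [], "dropped": []
    }
    for vid, now in new_by_id.items():
        before = old_by_id.get(vid)
        if before is None:
            changes["added"].append(vid)
            continue
        if before.address != now.address:
            changes["moved"].append(vid)
        if before.name != now.name:
            changes["renamed"].append(vid)
    # Anyone in the old snapshot but not the new one has dropped off the rolls.
    changes["dropped"] = [vid for vid in old_by_id if vid not in new_by_id]
    return changes


# Made-up sample data, for illustration only.
old = [
    Voter("1001", "Ann Smith", "12 Oak St"),
    Voter("1002", "Bob Jones", "40 Elm St"),
    Voter("1003", "Cy Lee", "7 Pine Rd"),
]
new = [
    Voter("1001", "Ann Jones", "40 Elm St"),
    Voter("1002", "Bob Jones", "40 Elm St"),
    Voter("1004", "Di Park", "12 Oak St"),
]
result = compare_snapshots(old, new)
```

[In this made-up sample, voter 1001 shows up under both "moved" and
"renamed" and now shares an address and surname with voter 1002, which is
the kind of pattern that might suggest a marriage. That inference is only a
guess from the data, not something the records state.]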

I've written SPITBOL (!) programs to do a lot of these kinds of analysis...
for example to produce targeted mailing labels (while eliminating wasteful
duplicate mailings to the same household), precinct-by-precinct statistics,
how politically consistent and predictable a given precinct/neighborhood
tends to be, targeted block-walking lists for political candidates, etc.
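
[A minimal sketch of the household de-duplication idea, in Python rather
than SPITBOL and not the author's actual code. The household_key
normalization is a deliberately crude assumption (uppercase, strip periods
and commas, collapse whitespace); real address matching would need more
care with apartment numbers, abbreviations, and so on. The sample names and
addresses are invented.]

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def household_key(address: str) -> str:
    """Crude normalization so trivially different spellings match."""
    cleaned = address.upper().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())


def household_labels(voters: List[Tuple[str, str]]) -> List[str]:
    """Return one mailing label per household instead of one per voter."""
    households: Dict[str, List[str]] = defaultdict(list)
    display: Dict[str, str] = {}
    for name, address in voters:
        key = household_key(address)
        households[key].append(name)
        # Keep the first spelling seen as the address printed on the label.
        display.setdefault(key, address)
    labels = []
    for key in sorted(households):
        names = households[key]
        addressee = names[0] if len(names) == 1 else f"{names[0]} and household"
        labels.append(f"{addressee}\n{display[key]}")
    return labels


# Made-up sample data, for illustration only.
voters = [
    ("Ann Jones", "40 Elm St."),
    ("Bob Jones", "40 elm st"),
    ("Di Park", "12 Oak St"),
]
labels = household_labels(voters)
```

[Here three registered voters produce two labels, because the two Elm St
records normalize to the same household key.]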


On 1/2/2018 1:45 PM, Dave Farber wrote:

Begin forwarded message:

*From:* Dewayne Hendricks <dewayne () warpspeed com>
*Date:* January 2, 2018 at 12:34:54 PM EST
*To:* Multiple recipients of Dewayne-Net <dewayne-net () warpspeed com>
*Subject:* *[Dewayne-Net] How Do You Vote? 50 Million Google Images Give a
Clue*
*Reply-To:* dewayne-net () warpspeed com

[Note:  This item comes from friend Judi Clark.  DLH]

How Do You Vote? 50 Million Google Images Give a Clue
By STEVE LOHR
Dec 31 2017
<https://www.nytimes.com/2017/12/31/technology/google-images-voters.html>

What vehicle is most strongly associated with Republican voting districts?
Extended-cab pickup trucks. For Democratic districts? Sedans.

Those conclusions may not be particularly surprising. After all, market
researchers and political analysts have studied such things for decades.

But what is surprising is how researchers working on an ambitious project
based at Stanford University reached those conclusions: by analyzing 50
million images and location data from Google Street View, the street-scene
feature of the online giant’s mapping service.

For the first time, helped by recent advances in artificial intelligence,
researchers are able to analyze large quantities of images, pulling out
data that can be sorted and mined to predict things like income, political
leanings and buying habits. In the Stanford study, computers collected
details about cars in the millions of images it processed, including makes
and models.

“All of a sudden we can do the same kind of analysis on images that we have
been able to do on text,” said Erez Lieberman Aiden, a computer scientist
who heads a genomic research center at the Baylor School of Medicine. He
provided advice on one aspect of the Stanford project.

For computers, as for humans, reading and observation are two distinct ways
to understand the world, Mr. Lieberman Aiden said. In that sense, he said,
“computers don’t have one hand tied behind their backs anymore.”

Text has been easier for A.I. to handle, because words have discrete
characters — 26 letters, in the case of English. That makes it much closer
to the natural language of computers than the freehand chaos of imagery.
But image recognition technology, much of it developed by major technology
companies, has improved greatly in recent years.

The Stanford project gives a glimpse at the potential. By pulling the
vehicles’ makes, models and years from the images, and then linking that
information with other data sources, the project was able to predict
factors like pollution and voting patterns at the neighborhood level.

“This kind of social analysis using image data is a new tool to draw
insights,” said Timnit Gebru, who led the Stanford research effort. The
research has been published in stages, the most recent in late November in
the Proceedings of the National Academy of Sciences.

In the end, the car-image project involved 50 million images of street
scenes gathered from Google Street View. In them, 22 million cars were
identified, and then classified into more than 2,600 categories like their
make and model, located in more than 3,000 ZIP codes and 39,000 voting
districts.

But first, a database curated by humans had to train the A.I. software to
understand the images.

The researchers recruited hundreds of people to pick out and classify cars
in a sample of millions of pictures. Some of the online contractors did
simple tasks like identifying the cars in images. Others were car experts
who knew nuances like the subtle difference in the taillights on the 2007
and 2008 Honda Accords.

“Collecting and labeling a large data set is the most painful thing you can
do in our field,” said Ms. Gebru, who received her Ph.D. from Stanford in
September and now works for Microsoft Research.

But without experiencing that data-wrangling work, she added, “you don’t
understand what is impeding progress in A.I. in the real world.”

[snip]

Dewayne-Net RSS Feed: http://dewaynenet.wordpress.com/feed/
Twitter: https://twitter.com/wa8dzp


