The Copy Editor

I'm Jojo Pasion Malig. I'm the usual suspect behind the night desk of the Philippines' leading news website. I like making interactive data eye candy. Mild prescriptivist.
Posts tagged "Data"

tecnoetica:

These recipes may be most helpful to journalists who are trying to learn programming and already know the basics. If you’re already an experienced programmer, you might learn about a new library or tool you haven’t tried yet.

(via lifeandcode)

Most style guides and dictionaries have come to accept the use of the noun data with either singular or plural verbs, and we hereby join the majority.

WSJ style guide doyen Paul Martin answers whether data are plural, or whether it’s singular.

More good reads on the subject at The Guardian and The Economist.

sprinklersareevil:

…But I’d also experiment with some ideas that could break new ground in other ways. For advice on how to make these work, I’d pick the brains of some of the folks I’ve worked with at the Harvard Berkman Center for internet and society and in the broader technology world, as well as people in journalism. Examples (third parties have already created or are creating some of these):

• “Anonymice Tracker” – an open and publicly searchable database of every story quoting anonymous sources, annotated in various ways (for cross-tabular purposes) such as whether a story was based entirely on such sources, as well as quoting the reason(s) given for granting anonymity.

• “Feedback Zeitgeist” – semantic analyses, including visualizations, of correspondents’ email and forum/comment posts. I suspect this could be extraordinarily illuminating once we had some data baselines.

• “Error Notifier” – a system whereby anyone who signs up receives an automatic email notification (assuming he or she actually looked at the original posting or column) of any error in what I’ve written.

• “PubEd Submitterator” – borrowing the second word from BoingBoing, my favorite blog, which relies on its readers for tips on what to show to the rest of us. The main purpose would be to get help finding the best critiques.

• “Goof Tracker” – a reader-fed database of what they believe are errors of fact and whether they’ve been corrected. I understand that the newsroom has its own non-public database, and I believe there should be some public listing of this kind. I also recognize the difficulties of making this work, but it’s worth a try.

You’ll have noticed that most of these ideas, as well as my must-do list above, share my conviction that the audience should be an integral part of this process. The readers and other constituencies should participate, not just read, by saying what they know and believe, and by adding data where we can create structured input systems.


Dan Gillmor proves his awesomeness once again. He raises some very interesting ideas here about the role of the audience and the internet in modern online journalism.

And it is awesome that he is a fan of Boing Boing.

The average person receives 63,000 words of new information every day.

Barrett Sheridan, Is Cue the Cure for Information Overload?

The Copy Editor asks: That’s for the average person. Any ballpark figure on the quantity of information processed by journalists every day?

What should “Three Laws for Journalists” look like, based on Isaac Asimov’s Three Laws of Robotics? 

PBSMediaShift:

1. Digital systems must be designed to protect and ensure, to the fullest extent possible, personal data and its exchange and communication.

2. Journalists must pursue all stories deemed to be in the public interest, even where that may require challenging the security of digital systems.

3. Journalists must protect their sources as well as the innocent public to the same extent as the digital systems of the First Law, where it would otherwise render the impossibility of the Second Law.

futurejournalismproject:

Visualizing the 99%

The Guardian put together this animated explainer about wealth distribution in the United States.

Click through to see the data behind the animation.

If everything is shared automatically, nothing has significance.
Smart post addressing some of the downsides of “frictionless” sharing. (via arainert)

(via mediamediamedia)

onaissues:

Visualizing the Occupy Wall Street protests: 

Mother Jones has put together this Occupy Wall Street map, which they continue to update, and they are asking readers to submit new locations or news stories associated with the protests. They also have an excellent rundown of how Occupy Wall Street is utilizing social media, along with charts, stats and ongoing coverage.

The Daily Kos has mapped over 200 Occupy Wall Street groups using Facebook. Find the map and full list here.

The Atlantic has a powerful photo gallery of the protests beyond New York, spanning from LA to Boston. 

motherjones:

Have to keep changing the zoom level on our Occupy Wall Street map—news reports say protests have spread to Anchorage, Hilo, Hawaii, several cities in Canada, and now Melbourne.

reporter-arm:

The first Guardian data journalism: May 5, 1821

Ooof.

onaissues:

Explore large image collections with ImagePlot

“Existing visualization tools show data as points, lines, and bars. ImagePlot’s visualizations show the actual images in your collection. The images can be scaled to any size and organized in any order - according to their dates, content, visual characteristics, etc. Because digital video is just a set of individual still images, you can also use ImagePlot to explore patterns in films, animations, video games, and any other moving image data.”

Read more on Flowing Data: Explore large image collections with ImagePlot

futurejournalismproject:

If you’re a data junkie, O’Reilly Radar has a freebie for you: its collected writings on big data since June 2010 published as an ebook.

Called Big Data Now: Current Perspectives from O’Reilly Radar, the book covers:

Data issues — The opportunities and ambiguities of the data space are evident in this segment’s discussions around privacy, the implications of data-centric industries, and even in the debate about the phrase “data science” itself.

The application of data — An exploration of data applications showed that this segment is quickly expanding to include everything from data startups to established enterprises to media/journalism to education and research. A “data product” can emerge from virtually any domain.

Data science and data tools — The tools and technologies that drive data science are, of course, essential to this space, but the varied techniques being applied are also key to understanding the big data arena.

The business of data — This is all about the actions connected to data — the process of finding, organizing, and analyzing data that allows organizations of all sizes to improve and innovate.

Download (via O’Reilly).

A quick data visualization I made for this business story.

I <3 IBM’s Many Eyes.

futurejournalismproject:

IBM’s Visual Communications Lab created a tool that visualizes who’s writing what at the New York Times.

Via the VCL blog:

You begin by performing a search for a topic of interest. Pick a keyword you’re interested in, such as “Tsunami”. This will fetch articles containing that term that were written in the last 30 days and build the visualization from them.

The above is the result for our search for “journalism.” The results, as explained by the VCL:

Each bubble represents a single human-created tag describing an article. The size corresponds to the overall frequency that specific tag was used to describe articles about your query by labeling a related article.

When you hover over a tag’s bubble you will see the other tags it was used with. The thickness of that connection will imply how frequently that pairing occurred.

You can play with NYT Writes here.

H/T: Flowing Data.
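
The Copy Editor adds: for the curious, the two things the VCL describes (bubble size and connection thickness) boil down to two simple counts: how many articles carry each tag, and how many articles carry a given pair of tags together. Below is a rough Python sketch of that counting. It is not the VCL's actual code, and the article tags are made-up sample values, not real New York Times data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample: each inner list holds the tags on one article
# returned for a search. Made-up values, not real NYT data.
articles = [
    ["Journalism", "Newspapers", "Internet"],
    ["Journalism", "Newspapers"],
    ["Journalism", "Internet"],
    ["Newspapers", "Pulitzer Prizes"],
]

def tag_frequencies(articles):
    """Count how many articles carry each tag (would drive bubble size)."""
    counts = Counter()
    for tags in articles:
        counts.update(set(tags))  # set() so a repeated tag counts once per article
    return counts

def tag_pairs(articles):
    """Count how often two tags share an article (would drive line thickness)."""
    pairs = Counter()
    for tags in articles:
        # Sort so each pair is always stored in the same order.
        for a, b in combinations(sorted(set(tags)), 2):
            pairs[(a, b)] += 1
    return pairs

sizes = tag_frequencies(articles)
links = tag_pairs(articles)
```

Here `sizes` would set how big each bubble is drawn, and `links` would set how thick the line between two bubbles is. How those counts get scaled to pixels is a design choice the VCL post does not spell out.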

visualturn:

“Poverty is a more powerful influence on test scores than value added by teachers and schools.” University of Texas physics Prof. Michael Marder’s visualization of the correlation between low SAT scores, poverty, and race.

Journalism in the Age of Data from Geoff McGhee on Vimeo:

Journalists are coping with the rising information flood by borrowing data visualization techniques from computer scientists, researchers and artists. Some newsrooms are already beginning to retool their staffs and systems to prepare for a future in which data becomes a medium. But how do we communicate with data? How can traditional narratives be fused with sophisticated, interactive information displays?

Watch the full version with annotations and links.